Project Nemo began as a hackathon idea and later became reality with investment from senior leadership. I pitched the idea to members of product, audience growth, SEO, and editorial, and we formed the team that not only won the hackathon but also continued the project throughout the year.
SEO contributes to nearly 40% of our monthly active users, and its performance is greatly determined by the way content is written. At SCMP, we have 300 journalists publishing news articles every day and they are often extremely busy. While some editors are interested in increasing their SEO traffic, they often do not have the time to do any SEO research to figure out which keywords they should use or which topics they should write more about.
To solve this problem, we created Project Nemo, which uses natural language processing to scan our article content and extract or suggest SEO-friendly keywords that are relevant to each article. Our team codenamed the project “Nemo” because finding the right keywords is often as challenging as searching for the fish in the Pixar film Finding Nemo.
To extract keywords, we carried out a series of data cleaning steps (e.g. removing stop words) and adopted three NLP approaches: TF-IDF, YAKE, and spaCy. The keywords are then fed into SEO traffic-related APIs and data sources, including Google Trends, SEMRush, and internal SCMP traffic data, to generate a list of keywords with their search trends.
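As an illustration of the extraction step, here is a minimal sketch of stop-word removal followed by TF-IDF ranking, written in plain Python. It is not the production Nemo code: the function names, the tiny stop-word list, and the smoothed IDF formula are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real pipeline would use a fuller one).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is", "for", "with", "as"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def tfidf_keywords(articles, index, top_n=3):
    """Return the top_n (term, score) pairs for articles[index], ranked by TF-IDF."""
    docs = [tokenize(a) for a in articles]
    n_docs = len(docs)
    # Document frequency: the number of articles that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    counts = Counter(docs[index])
    total = len(docs[index])
    scores = {
        # Term frequency times a smoothed inverse document frequency.
        term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
        for term, count in counts.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


articles = [
    "Alibaba reports strong quarterly earnings as Alibaba cloud revenue grows",
    "Hong Kong stocks rise on strong earnings",
    "Typhoon forces Hong Kong to close schools",
]
print(tfidf_keywords(articles, 0))
```

Terms that appear often in one article but rarely in the others score highest, which is why "alibaba" ranks first for the first article here. In a setup like the one described above, the resulting keywords would then be looked up against search trend data.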
1. Integration with Swing - our internal publishing tool
We have integrated Nemo with Swing, the internal publishing tool that editorial uses to write and publish articles. In Swing, we started by adding 1) a keywords section that extracts keywords from the article and attaches search data to each keyword. After building 1), we ran a series of beta tests with specific editorial desks for feedback. Our next step is to build 2) a suggestion tool that recommends additional important keywords to include in the SEO headline, index headline or bullet summary.
2. Evergreen article keyword alerts
Our next goal is to create an evergreen article keyword alert tool that analyzes the traffic of past SCMP articles and identifies keywords with a spike in traffic. For example, if a group of articles about “Daniel Zhang” (CEO of Alibaba Group) in the Technology section shows a spike in traffic, we can alert the newsroom to write more articles about the topic.
We can then aggregate these data insights to create dashboards for specific desks, with trending SEO keywords tailored to the content that they are writing. These dashboards are more helpful than general-purpose tools such as Google Trends, MOZ, or SEMRush because they are tailored specifically for SCMP, combining article insights from its pool of historically published content.
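The keyword spike alert described in this section could work along the following lines. This is only a sketch under stated assumptions: the input shape (daily page views aggregated per keyword), the 7-day baseline window, the z-score threshold, and the function name `spiking_keywords` are all hypothetical and not taken from the actual tool.

```python
from statistics import mean, pstdev


def spiking_keywords(daily_views, window=7, threshold=3.0):
    """Flag keywords whose latest daily views sit far above their recent baseline.

    daily_views maps a keyword to a list of daily view counts, oldest first.
    A keyword is flagged when its latest value is at least `threshold`
    standard deviations above the mean of the preceding `window` days.
    """
    alerts = []
    for keyword, views in daily_views.items():
        if len(views) < window + 1:
            continue  # not enough history to form a baseline
        baseline = views[-(window + 1):-1]
        mu = mean(baseline)
        # Floor the deviation so a perfectly flat series does not divide by zero.
        sigma = max(pstdev(baseline), 1.0)
        z = (views[-1] - mu) / sigma
        if z >= threshold:
            alerts.append((keyword, round(z, 2)))
    # Strongest spike first.
    return sorted(alerts, key=lambda kv: -kv[1])


# Made-up numbers, for illustration only.
daily_views = {
    "daniel zhang": [100, 110, 95, 105, 100, 98, 102, 400],
    "typhoon": [200, 210, 190, 205, 195, 200, 198, 204],
}
for keyword, z in spiking_keywords(daily_views):
    print(f"Spike alert: {keyword} (z-score {z})")
```

With this sample data only "daniel zhang" is flagged, since its latest day is far outside its usual range while "typhoon" stays within normal variation. A per-desk dashboard could group such alerts by section.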
After running a series of beta tests and feedback sessions with editorial, we have received great feedback about our tool, as it has helped them identify SEO keywords. It's hard to completely isolate the impact of Nemo on search traffic (we cannot run SEO-related A/B tests on published articles due to issues with journalistic integrity), but organic traffic for the desks in the beta test increased by 30% over the two months of testing. The marketing & SEO team has also seen tangible improvements in the search ranking of selected keywords for those desks.