Why Crawl Budget and URL Scheduling Might Impact Rankings in Website Migrations

During a migration, many webmasters notice turbulence in their rankings and assume that PageRank has been lost. More often, the signals that affect rankings simply have not passed to the new pages yet. Googlebot also has to collect a large amount of data, which is collated in logs, mapped and updated internally, and rankings can fluctuate throughout this process. If you are an SEO engineer or web developer, the sections below explain why a website migration can affect rankings.

Crawl Budget = host load + URL scheduling combined

URL scheduling answers the question “which URLs does Googlebot want to visit, and how often?”, while host load answers “what can Googlebot visit from an IP/host, based on capacity and server resources?” Both matter in migrations, and together they make up the “crawl budget” for an IP or host.

This has little impact if your website has only a few pages, but it matters a great deal for an e-commerce or news site with tens of thousands, hundreds of thousands, or more URLs. Sometimes crawling tools run before the migration goes live detect nothing wrong, yet rankings and overall visibility still drop afterwards.

This can be caused by “late and very late signals in transit” rather than “lost signals.” Some signals can take months to pass, because Googlebot does not crawl large websites the way crawling tools do.

Change Management/Freshness is Important

Change frequency affects crawl frequency, and URLs change all the time on the web. Search engines need to keep the probability of embarrassment (the “embarrassment metric”), that is, of returning stale content in search results, below an acceptable threshold, and they must manage this efficiently. To avoid such embarrassment, scheduling systems prioritize crawling important pages that change frequently over less important pages, such as low-authority pages or pages whose changes are insignificant.

These key pages are the ones search engine users are most likely to see, unlike pages that rarely appear in search engine results pages. Search engines also learn over time how often a page changes in ways that matter, by comparing the latest copy of the page with previous copies to detect patterns of critical change.
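The prioritization described above can be pictured with a small sketch. The scoring function, the field names (`importance`, `changesPerMonth`) and the sample pages are all hypothetical and made up for illustration; they are not how Google actually scores pages, only one way to show “important and frequently changing pages first”.

```javascript
// Hypothetical scoring: the formula and the numbers are assumptions
// for illustration only, not a real search engine algorithm.
function crawlPriority(page) {
  // importance: 0..1, changesPerMonth: observed meaningful changes
  return page.importance * (1 + page.changesPerMonth);
}

// Made-up sample pages.
const pages = [
  { url: "/terms", importance: 0.1, changesPerMonth: 0 },
  { url: "/", importance: 1.0, changesPerMonth: 30 },
  { url: "/old-press-release", importance: 0.05, changesPerMonth: 0 },
  { url: "/category/shoes", importance: 0.6, changesPerMonth: 8 },
];

// Highest priority first: this is the order a scheduler would crawl in.
const queue = [...pages].sort((a, b) => crawlPriority(b) - crawlPriority(a));
console.log(queue.map((p) => p.url));
```

In this toy model the home page is always queued first, while the static, low-authority pages sink to the end of the queue.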

Why can’t Googlebot visit migrated pages all at once?

The explanation above leads to a useful picture: Googlebot usually arrives at a website with a purpose, a “work schedule,” and a “bucket list” of URLs to crawl during the visit. It works through that bucket list and then checks whether there is anything more important than the URLs on the original list that also needs collecting.

If there are such URLs, Googlebot may go a little further and crawl them as well. If nothing more important is discovered, Googlebot returns with another bucket list on its next visit to your site.
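A toy model may help show why a whole migrated site is not picked up in one go. The function `simulateVisits`, the per-visit `budget` and the priorities below are assumptions made for illustration, not Googlebot's real behaviour: each visit simply takes the highest-priority URLs that fit in the budget.

```javascript
// Hypothetical model: each visit crawls at most `budget` URLs,
// highest priority first. Real crawlers also revisit important
// pages, so low-priority URLs would wait even longer than shown.
function simulateVisits(urls, budget) {
  const pending = [...urls].sort((a, b) => b.priority - a.priority);
  const visits = [];
  while (pending.length > 0) {
    // splice removes the crawled URLs from the pending list
    visits.push(pending.splice(0, budget).map((u) => u.url));
  }
  return visits;
}

// Made-up migrated URLs with made-up priorities.
const migrated = [
  { url: "/home", priority: 10 },
  { url: "/products", priority: 8 },
  { url: "/blog/2014/old-post", priority: 1 },
  { url: "/about", priority: 5 },
  { url: "/tag/misc", priority: 2 },
];

const visits = simulateVisits(migrated, 2);
visits.forEach((list, i) => console.log("visit " + (i + 1) + ":", list));
```

With a budget of two URLs per visit, the least important page is only reached on the third visit, even though it was redirected at the same time as the home page.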

Whether you have recently migrated a site or not, Googlebot mostly focuses on a small number of important URLs, with only occasional visits to those deemed least important or not expected to change materially very often.

Moreover, when Googlebot comes across many redirect response codes, this likely signals that a migration of some sort is underway. Even then, it is mostly the most important migrating URLs that get crawled as a priority, perhaps more frequently than they normally would be. It is therefore important to know the factors, aside from page importance and change frequency, that determine when URLs are visited: limited search engine resources, host load, URL queues, and the low importance of migrating pages.

Starter’s Guide to Regular Expression (Regex)

A regular expression is a sequence of characters that forms a search pattern. It is commonly used for validation, for example to check credit card numbers, or for replacing matched text with another string. It is also widely supported: learn it once and you can use it across many programming languages.

Despite these advantages, many developers take one look at regex and then ignore it completely, put off by its apparent complexity. Those who do manage to learn it can search and manipulate text much more effectively.

Even if you are already comfortable with JavaScript, you still need to learn the characters, classes, quantifiers, modifiers, and methods used in regex.

Let’s see a simple example with an explanation. This is a regex.

B[a-zA-Z\d]+

Read from left to right, this regex matches the character ‘B’ followed by at least one character from ‘a’ to ‘z’ or ‘A’ to ‘Z’, or a digit from 0 to 9.
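To make the “at least one” part concrete, here is a minimal sketch that tests a few short strings against the pattern. The test strings are made up for illustration, and `console.log` is used so the snippet can also run outside a browser.

```javascript
const pattern = /B[a-zA-Z\d]+/;

// "B" alone fails: the + quantifier needs at least one more character.
const onlyB = pattern.test("B");
// "B7" passes: B followed by a digit.
const bDigit = pattern.test("B7");
// "b7" fails: the pattern is case sensitive, so lowercase b is not B.
const lowerB = pattern.test("b7");
// "B-7" fails: the hyphen is not in the character class.
const bHyphen = pattern.test("B-7");
// "aBc" passes: the match may start anywhere in the string.
const inside = pattern.test("aBc");

console.log(onlyB, bDigit, lowerB, bHyphen, inside);
```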

Here’s a sample line; the possible matches in it are Basket, B12, BaS04 and BC:

Basket, bulb, B12 vitamin, BaS04, N BC company

By default, the above regex will stop the search at

Basket

and return a positive response. The global modifier ‘g’ has to be specified if you want the regex to find all the possible matches.
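The difference the ‘g’ modifier makes can be checked on the sample line directly. This is a minimal sketch; the variable names are arbitrary.

```javascript
const line = "Basket, bulb, B12 vitamin, BaS04, N BC company";

// Without "g": match() reports only the first match and its position.
const first = line.match(/B[a-zA-Z\d]+/);
console.log(first[0], first.index); // Basket 0

// With "g": match() returns every match in the line.
const all = line.match(/B[a-zA-Z\d]+/g);
console.log(all); // [ 'Basket', 'B12', 'BaS04', 'BC' ]
```

Note that “bulb” is skipped because the pattern is case sensitive, and the “N” is skipped because only an uppercase ‘B’ can start a match.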

Below are several ways to use this expression in JavaScript. The first method, test, returns true if a match is found and false otherwise.

var input = "your test string",
    regex = /B[a-zA-Z\d]+/;

if (!regex.test(input))
    alert('No match is found');
else
    alert('A match is found');

Let’s try another method: match returns the matches found in an array.

// The global modifier 'g' is added to the regex to get all the matches.
var input = "your test string",
    regex = /B[a-zA-Z\d]+/g,
    ary = input.match(regex);

if (ary === null)
    alert('No match is found');
else
    alert('matches are: ' + ary.toString());

How about string replace? Let’s try that with regex now.

var input = "your test string",
    regex = /B[a-zA-Z\d]+/g;

alert(input.replace(regex, "#"));
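Applied to the sample line from earlier, the replacement looks like this. The second call is an extra illustration not covered above: replace also accepts a function, which receives each matched string and returns its replacement.

```javascript
const sample = "Basket, bulb, B12 vitamin, BaS04, N BC company";

// Replace every match with "#".
const masked = sample.replace(/B[a-zA-Z\d]+/g, "#");
console.log(masked); // #, bulb, # vitamin, #, N # company

// Compute each replacement from the matched text instead.
const upper = sample.replace(/B[a-zA-Z\d]+/g, (m) => m.toUpperCase());
console.log(upper); // BASKET, bulb, B12 vitamin, BAS04, N BC company
```

Without the ‘g’ modifier, only the first match (“Basket”) would be replaced.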

10 Terminal Shortcuts Developers Need to Know

There is plenty of web development software that can help a developer work efficiently. However, knowing a few keyboard combinations also comes in handy when you want to speed up your work. Below are 10 keyboard shortcuts on OS X that will make life easier if you’re working in Terminal.

  • Option/Alt+Left or Right

Have you ever wondered which shortcut lets you move the cursor between words in a command line? Use Option and the left arrow to move back one word, and Option and the right arrow to move forward one word.

  • Escape + T

Use Escape + T to swap the two words that appear immediately before the cursor.

  • Control + R

Control + R finds a previously used command in Terminal. It opens the [(reverse-i-search)`’:] prompt, where you can type part of a command you need again to search your history for it.

  • Control + C

To abort the current application and kill what’s currently running, use Control + C.

  • Control + U

If you have typed to the end of a line and realize the whole line is wrong, don’t worry: Control + U clears everything on the line before the cursor.

  • Control + K

Control + K does the opposite of Control + U: it clears the part of the line after the cursor, which is helpful when you need to change or delete the latter half of a line.

  • Command + K

If you want to clear everything currently shown in the Terminal window, Command + K will clear it all. You can also use Control + L or type “clear” into Terminal.

  • Control + Z

Control + Z suspends whatever is currently running in the foreground. Typing “fg” afterwards brings the suspended job back, and “bg” lets it continue in the background. If a command runs into permission issues, try entering sudo before it.

  • History + a Number

If you lose track of a command you typed earlier, type “history” into Terminal to list your previous commands. You can also type a space and a number after history: in bash, “history 5” shows the last five commands you typed.

  • Escape + B

Escape + B is an alternative way of moving the cursor back by one word, just like the Option and left arrow shortcut.