Getting Data From the Wikipedia API Using jQuery

Where would we be without Wikipedia? It’s hard to believe we ever survived without this icon of the Internet. It was built on the novel idea of an encyclopedia that anyone can edit, and it’s always kinda interesting how you can look up an article about one thing and then end up somewhere completely different as you jump from article to article through its link-heavy content. But more important than that, Wikipedia is the authoritative arbiter of all Internet arguments, where the combatants usually have nothing at hand but extremely biased and inflammatory blogs as source material and are too lazy to do their own research using actual respectable academic sources.

I’ve even heard that this spills over into the “real world.” Wikipedia has reported that it sees a huge spike in traffic from mobile devices on Friday and Saturday nights, and the underlying cause is believed to be attempts to settle disagreements that break out from drunken arguments in bars.


So, all things considered, Wikipedia usually has a good base of information on just about any subject under the sun. And it got me thinking that a lot of websites and applications could benefit greatly from having an easy way to pull data from Wikipedia and use it to populate content. Fortunately, Wikipedia is built on software (MediaWiki) that provides an API for doing this. I remember seeing that the super-slick Google Chrome experiment 100,000 Stars, put out by the folks at Google, used Wikipedia excerpts as a brief description of each star. So I thought I’d look into the process of pulling data from Wikipedia using JavaScript and write it down because I figure someone else would find it useful at some point. Because other wikis are built upon the same foundation as Wikipedia itself, this is not limited to getting data from Wikipedia alone. You can query other wikis as well.

So to start, we’re going to need an article on Wikipedia to hit. Let’s use Jimi Hendrix. First we’re going to need to figure out what URL to call. I actually decided to use jQuery just because it’s a bit easier to parse through the returned data, but all of this could be done in native JavaScript if you wanted. So we’ll be making use of jQuery’s ajax method to make our asynchronous call.

So according to the API docs, the way that we call the wiki API is by using an endpoint like the following…

http://en.wikipedia.org/w/api.php?format=json&action=query&titles=Main%20Page&prop=revisions&rvprop=content

This is just one example. The API reference lists all of the possible actions. Reading through the docs, we can see that we’re probably going to want to specify JSON as the returned format. I chose the parse action and prop=text because I want to get the text out of the page. Note that the parse action takes the article name in a page parameter rather than titles. Rather than pulling down the entire page of text, let’s say we only want the top section (the introductory blurb of the article). To do this, we specify section=0. If you omit the “section” parameter, the entire page of data will be pulled down. It’s probably outside the scope of this article to go into a detailed explanation of what each of these actions does, so if all of this seems a little overwhelming and it feels like we’re moving too fast, take a look through the API documentation for a more detailed description of what each of these components does.
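To make the role of each parameter a bit more concrete, here is a minimal sketch that assembles the query string piece by piece. The helper name buildParseUrl is made up for this illustration, the https endpoint is an assumed default, and the sketch uses the browser's built-in URLSearchParams rather than anything from jQuery. It also swaps spaces in the title for underscores, which matches how the page name is written in the URLs in this post.

```javascript
// Hypothetical helper (not part of jQuery or MediaWiki): builds a request URL
// for the "parse" action described above.
function buildParseUrl(pageTitle, endpoint) {
    // Assumed default endpoint; other MediaWiki sites expose their own api.php.
    endpoint = endpoint || "https://en.wikipedia.org/w/api.php";

    var params = new URLSearchParams({
        action: "parse",  // parse a page and return its rendered output
        format: "json",   // ask for the response as JSON
        prop: "text",     // only the rendered text of the page
        section: "0",     // only the top section; leave out for the whole page
        page: pageTitle.trim().replace(/\s+/g, "_") // "Jimi Hendrix" -> "Jimi_Hendrix"
    });

    return endpoint + "?" + params.toString();
}

console.log(buildParseUrl("Jimi Hendrix"));
// https://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix
```

Passing a second argument points the same helper at a different wiki's api.php.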

So we know at the very least we’re going to need something like what is below…

$(document).ready(function(){
    $.ajax({
        type: "GET",
        url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix",
        contentType: "application/json; charset=utf-8",
        async: false,
        dataType: "json",
        success: function (data, textStatus, jqXHR) {
            console.log(data);
        },
        error: function (errorMessage) {
        }
    });
});

If we try to make this call, nothing happens. This is because we are being blocked by the Same-origin policy. Just simple JSON is not going to suffice, so we’re going to need to trigger JSONP (JSON with Padding) by adding in a callback parameter callback=?.

Now, our AJAX call looks like the following…

$(document).ready(function(){
    $.ajax({
        type: "GET",
        url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix&callback=?",
        contentType: "application/json; charset=utf-8",
        async: false,
        dataType: "json",
        success: function (data, textStatus, jqXHR) {
            console.log(data);
        },
        error: function (errorMessage) {
        }
    });
});
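The callback=? suffix works because jQuery treats it as a request for JSONP. A variation, sketched here rather than taken from the original code, asks for JSONP explicitly with dataType: "jsonp" and passes the parameters as a data object so that jQuery builds the query string itself. The function name requestIntro and its arguments are assumptions for this example; jQuery is passed in as the first argument only so the sketch does not depend on a global.

```javascript
// Hypothetical wrapper around $.ajax; `jq` is the jQuery object.
function requestIntro(jq, pageTitle, onSuccess) {
    return jq.ajax({
        url: "https://en.wikipedia.org/w/api.php", // assumed endpoint
        dataType: "jsonp", // jQuery adds its own callback parameter
        data: {            // serialized into the query string by jQuery
            action: "parse",
            format: "json",
            prop: "text",
            section: 0,
            page: pageTitle
        },
        success: onSuccess
    });
}

// Usage in a page that has already loaded jQuery:
// requestIntro(jQuery, "Jimi_Hendrix", function (data) { console.log(data); });
```

Keeping the parameters in an object also makes it easier to change a single value, such as the page name, without editing a long URL string.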

If we look in our console now, we can see we have data from Wikipedia! Wow! If we take a look at this object we can see that the part we are interested in is shown below.

{
    warnings: { ... },
    parse: {
        text: {
            "*": "<div class=\"dablink\">This article..."
        }
    }
}

Those maybe are not the key/value pairs I would have chosen but oh well. This means that to get the markup that we need out of the object, we just have to find the key. And if we add a div to our page with an id, e.g. <div id="article"></div>, we can dump the markup into this div like so…

$(document).ready(function(){
    $.ajax({
        type: "GET",
        url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix&callback=?",
        contentType: "application/json; charset=utf-8",
        async: false,
        dataType: "json",
        success: function (data, textStatus, jqXHR) {
            var markup = data.parse.text["*"];
            var blurb = $('<div></div>').html(markup);
            $('#article').html($(blurb).find('p'));
        },
        error: function (errorMessage) {
        }
    });
});
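The lookup data.parse.text["*"] assumes the request came back with a parsed page. As a hedged sketch, the hypothetical helper below (extractMarkup is not part of jQuery or the API) returns the markup when it is present and null otherwise. It assumes that a failed request, such as one for a page that does not exist, comes back as an object with an error key instead of parse; the sample objects are made up for the demonstration.

```javascript
// Hypothetical helper: pull the HTML string out of a parse response,
// or return null if the response does not have the expected shape.
function extractMarkup(data) {
    if (!data || data.error || !data.parse || !data.parse.text) {
        return null;
    }
    var markup = data.parse.text["*"];
    return typeof markup === "string" ? markup : null;
}

// Made-up sample responses, for illustration only.
console.log(extractMarkup({ parse: { text: { "*": "<p>Hello</p>" } } })); // <p>Hello</p>
console.log(extractMarkup({ error: { code: "missingtitle" } }));          // null
```

In the success callback, a null result could be used to show a fallback message instead of writing nothing into the article div.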

Note that there are a few other issues with our returned data (like warnings and links not working) so I’ve added a few extra items below to clean things up a bit…

$(document).ready(function(){
    $.ajax({
        type: "GET",
        url: "http://en.wikipedia.org/w/api.php?action=parse&format=json&prop=text&section=0&page=Jimi_Hendrix&callback=?",
        contentType: "application/json; charset=utf-8",
        async: false,
        dataType: "json",
        success: function (data, textStatus, jqXHR) {
            var markup = data.parse.text["*"];
            var blurb = $('<div></div>').html(markup);
            // remove links as they will not work
            blurb.find('a').each(function() { $(this).replaceWith($(this).html()); });
            // remove any references
            blurb.find('sup').remove();
            // remove cite error
            blurb.find('.mw-ext-cite-error').remove();
            $('#article').html($(blurb).find('p'));
        },
        error: function (errorMessage) {
        }
    });
});
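The cleanup above strips the links because their href values are relative to Wikipedia (for example /wiki/Seattle) and so point nowhere on your own site. An alternative, sketched here under the assumption that you would rather keep the links, is to resolve each href against the wiki's origin. toAbsoluteWikiHref is a hypothetical helper built on the standard URL constructor, and the default origin is an assumption; in-page anchors such as #cite_note-1 are left alone.

```javascript
// Hypothetical helper: turn a wiki-relative href into an absolute one.
function toAbsoluteWikiHref(href, wikiOrigin) {
    wikiOrigin = wikiOrigin || "https://en.wikipedia.org"; // assumed default
    if (!href || href.charAt(0) === "#") {
        return href; // in-page anchors and empty values are returned unchanged
    }
    // Relative and protocol-relative values are resolved against the origin;
    // values that are already absolute keep their own host.
    return new URL(href, wikiOrigin).href;
}

console.log(toAbsoluteWikiHref("/wiki/Seattle")); // https://en.wikipedia.org/wiki/Seattle
```

Inside the success callback this could stand in for the link-stripping line, for example by looping over blurb.find('a') and setting each link's href attribute to the value returned by the helper.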

You can see the demo below. I have also wrapped this into a jQuery plugin called Wikiblurb.js that you can find on GitHub. The plugin has more options than our simple example here to make the portions of your wiki that you’re grabbing much more customizable.

View Demo

38 Responses to Getting Data From the Wikipedia API Using jQuery


  1. May 22, 2014 at 9:24 pm  
    Great introduction to Wikipedia API. I spotted a little typo, In the 1st and 2sd listing you have an unwanted "}" after the console.log call ;)
  2. Robin

    May 29, 2014 at 6:00 pm  
    Thanks for that good tutorial! Really helps!
  3. NeaM

    November 26, 2014 at 1:23 am  
    Please How can I include the picture ? cause it displaying only text for me ? thanks

    • November 27, 2014 at 2:43 pm  
      Hey there NeaM -- You might have to run it in a server environment or a local server like WAMP or MAMP. I pulled down the GitHub package and it didn't seem to want to load the image from there. However, putting the package in a local server environment pulls everything (including the image) down fine.
  4. George

    November 30, 2014 at 9:48 pm  
    Thanks for wasting my freaking time. Doesn't work!

    • December 10, 2014 at 8:11 am  
      Hey there George -- Sorry it is not working for you. I just pulled down the current GitHub demo .zip and ran it from the desktop and it's working fine... though as mentioned above, for full functionality you may need to run it in a server environment or a local server like WAMP or MAMP. What does your console say? Any errors?
      • thomas

        March 9, 2016 at 5:59 am  
        hey Ian thanks for the detailed tutorial. i learned a lot from that. and i want to say its fascinating how calm you stay if i would write a tutorial and see a comment like george`s i would just tell him to fuck off. regards and thxs again thomas

      • March 12, 2016 at 5:58 am  
        Thanks Thomas! I'm glad that it was helpful for you. Maybe I'm completely numb to it because I've seen far worse said over Internet. Not to me necessarily, but others to each other. Let me know if you need help with anything else.
  5. David

    November 29, 2015 at 1:04 am  
    Hi Ian, I'm trying to add something like this to pages on a wordpress site. Could you explain how best to install the code please? Thanks

    • December 1, 2015 at 9:22 am  
      Hi there David -- It's really just a matter of including your script in the page after jQuery and calling it. For WordPress the recommended way of installing scripts is to use the wp_enqueue_script function. You could also do something like this...
      <script src="<?php echo get_template_directory_uri(); ?>/js/jquery.wikiblurb.js" type="text/javascript"></script>
      in your header.php or footer.php theme files.
  6. Tim

    December 9, 2015 at 6:53 pm  
    Hi, this is really helpful. I was wondering what if I just wanted all the links created to lead back to a single webpage (ie the main wikipedia page). Would you be able to explain this? Thanks.
  7. Pablo

    March 9, 2016 at 2:55 pm  
    Thanks a lot, Ian! your explanation helped me get the API to work with my little project!

    • March 12, 2016 at 5:59 am  
      No problem Pablo! Glad you found it helpful!
  8. Ty

    April 30, 2016 at 10:25 pm  
    Thanks Ian. Was able to use your code to get a call to the wikipedia API done. Thanks heaps and keep up the great work :-)
  9. prateek

    May 18, 2016 at 5:50 am  
    i want get data from wiki and pass the information to my database. Any Help?

    • May 20, 2016 at 8:36 am  
      If you need to put information into your database, I think you're probably better off making the same API call server-side. This is more suited for going in the response direction.
  10. Sage

    May 19, 2016 at 7:32 am  
    Hi Ian, I'm really glad to have found this it's exactly the kind of thing I was looking for. I do have a question, how would you go about changing the page to be a user-input based variable? I'd like to make use of a search form that pulls up the specified page on submit. Thanks again, and I hope you can get back to me!

    • May 20, 2016 at 8:31 am  
      Hi there Sage -- To do something like this you'd probably want to pass that info forward via query string from the form submit, get that page value out of the query string and pass it to the plugin on this next page.
  11. Hugo Barbosa

    June 8, 2016 at 12:29 am  
    Great example... I spent hours searching for a simple way to get the basic info from a search... and here your site and info had what I wanted... Now... How could I fetch the main image from the article I looked for, and add it on top on my html dump. Any pointers or ideas?

    • June 11, 2016 at 4:18 pm  
      Hi there Hugo -- I actually made this blog post into a plugin called Wikiblurb which gives you the option of getting the "infobox" (which includes the image) *or* you can specify your own custom selector(s) to grab. Give that a go and let me know if you have any troubles with it. Thanks!
  12. izumi

    June 10, 2016 at 10:27 am  
    hello, i do copy ur code to try it out.. and i want the wiki article paste on my page.. how i want to do that... example.. document.getelementbyid("output").innerhtml= ;.. after innerhtml what should i type
  13. izumi

    June 10, 2016 at 1:05 pm  
    to admin im sorry about my question just now, i just learn javascript so it confusing but now i already understand.. tq lot... but/... i want to ask u.. for the search.. i want to replace it with my defined var but it seems that if i insert one word is okay... but for two word... do i have to put underscore when inserting more words... or is there any ways that i can do about that
  14. izumi

    June 10, 2016 at 1:06 pm  
    i mean.. i replace jimi_hendrik with input(my defined var)

    • June 11, 2016 at 4:12 pm  
      Hi there izumi -- Yes you do need the underscores but fortunately I made this blog post into a plugin which can be found here. In the initialization it looks for spaces and will insert the underscores if needed, so all you need to do is call it with the name of whatever page it is you're trying to find.
  15. wani

    June 13, 2016 at 10:58 am  
    hello, i want to replace the jimi hendrix with input that i get from user, but i notice that if user insert two word it will produce nothing.. i know the problem is because it need hv to include '_' but how to just insert two word input normally and still get the output.. thankyouuuu

    • June 14, 2016 at 4:14 pm  
      I actually made this blog post into a plugin called Wikiblurb. In the initialization there it looks for spaces and will insert the _ characters if needed, so all you need to do is call it with the name of whatever page it is you're trying to find.

  16. June 21, 2016 at 11:37 pm  
    Thanks for the plugin. It works great. We are creating a module, where we will upload a csv file (list of pages for query ), and it will fetch all data and insert into db, on the basis of your module. Will share here once done. Thanks again for this plugin.

    • June 22, 2016 at 6:23 am  
      No problem! Glad you found it helpful!
  17. Weeb

    August 30, 2016 at 11:54 pm  
    What can i do if i want to get more articles that are related to the keyword i search for?
  18. sam

    September 14, 2016 at 9:50 am  
    For the customization of your plugin, how can you make it so that it grabs certain pieces of data from the wikipedia page. For example, if I just want name and birth date with a summary, how could I do that?
  19. Chuck

    December 16, 2016 at 2:49 pm  
    great work! I appreciate it. Is there anyway to get this to work with the links working inside of the blurb to where it opens the next page in "blurb" format as well? Thanks!

    • December 18, 2016 at 6:22 pm  
      As things are implemented right now in the callback after the data loads you would have to replace all the links in the container that the Wikipedia data is loaded into with a link to a second page or page that ran the plugin on load off of some parameters or something.
  20. Rosa

    January 16, 2017 at 9:27 pm  
    How can i display all sections?

    • January 17, 2017 at 5:38 pm  
      Hi there Rosa -- As far as I am aware, you would probably have to loop through the sections via multiple requests.
    • iCricket

      June 15, 2017 at 7:37 pm  
      Just remove &section=0 from the url. @Ian thanks alot mate! your "markup" variable has saved life. was struggle to cast the content because of this "*". anyway how about this? I've a prob to catch the content because "number" https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Nest
  21. anfelo

    June 19, 2017 at 4:46 pm  
    Hi Ian, thanks alot! This help me to get started with javaScript ajax requests and MediaWiki. I also wanted to get a json object containing a list of articles that matched my entry, I found this helpful link: https://stackoverflow.com/questions/25891076/wikipedia-api-fulltext-search-to-return-articles-with-title-snippet-and-image/25911756#25911756?newreg=8e78222a30d6428c901acb10baf70667
  22. py

    September 4, 2017 at 11:07 am  
    Merci Ian ! Lost in the 'Random Quote Machine' challenge of freecodecamp.org, your library saved me another day of errand. With some modifications I have been able to produce an array filled with the gorgeous poetry of Guillevic (you may read those marvelous lines on this page (in french, sorry)) : https://fr.wikiquote.org/wiki/Carnac Thanks, thanks a lot ! py
  23. Theo

    September 15, 2017 at 1:11 pm  
    Hi Thank you for the post. Links are working. This is the idea:
    //working links
    blurb.find('a').each(function() {
    var wikiAttr = $(this).attr('href');
    var hash = "#";
    var www ="www";
    if(wikiAttr.indexOf(hash) === -1 && wikiAttr.indexOf(www) === -1){
    $(this).attr('href', 'https://de.wikipedia.org' + wikiAttr).attr('title', 'https://de.wikipedia.org' + wikiAttr).attr('target', '_blank').addClass('myWikiLink');
    }
    if($(this).hasClass('external')){
    $(this).attr('title', wikiAttr).attr('target', '_blank');
    }
    });
    regards theo p.s.:you may have to alter css and inspect the wiki dom
