Your own movie database in 5 minutes with IMDb API and Perl

In this post I will show you how to import IMDb data for your movies and convert the XML into SQL for database import. From there you can start building your movie site.

IMDb API

In this post I use a small DVD collection. The movie data (director, actors, release year, etc.) comes from IMDb, and thanks to the IMDb API (Brian Fritz) you can fetch it easily with curl. At the end of this post there is a 5-minute video demo of the whole process.

How it works

  • First I create a database and create the movie collection table.
  • The script getMovieData.pl fetches the movie data from the API with curl and converts the XML to SQL. The escapeSingleQuote function escapes single quotes in the values, so a title or plot that contains one does not break the generated INSERT statement.
  • The batch.pl script is a small wrapper that runs getMovieData.pl on each title in the movie list you provide as an input file. For the example list you get 41 INSERT statements.
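
The XML-to-SQL step described in the list above can be sketched roughly as follows. This is a Python sketch for illustration, not the Perl code from getMovieData.pl. It assumes the API response looks like <root><movie .../></root> with the movie fields as attributes. The attribute names in FIELDS and the quote-doubling in escape_single_quote are assumptions. The column order follows the sample output shown in the comments below. Fetching with curl is left out, so the functions work on XML text you already have, and batch is only a hypothetical stand-in for what batch.pl does.

```python
import time
import xml.etree.ElementTree as ET

# Assumed attribute names on the <movie> element (not confirmed by the post).
FIELDS = [
    "title", "year", "rated", "released", "genre", "director", "writer",
    "actors", "plot", "poster", "runtime", "imdbRating", "imdbVotes", "imdbID",
]

# Hypothetical sample response, trimmed to two attributes.
SAMPLE = """<root response="True"><movie title="Ocean's Eleven" year="2001"/></root>"""


def escape_single_quote(value):
    """Double single quotes so the value can sit inside a SQL string literal."""
    return value.replace("'", "''")


def movie_xml_to_insert(xml_text, table="movie_collection", now=None):
    """Turn one XML response into one INSERT statement."""
    movie = ET.fromstring(xml_text).find("movie")
    if movie is None:
        raise ValueError("no <movie> element in response")
    timestamp = int(time.time()) if now is None else now
    # Missing attributes become empty strings instead of failing.
    values = [escape_single_quote(movie.get(name, "")) for name in FIELDS]
    values.append(str(timestamp))
    quoted = ", ".join("'{}'".format(v) for v in values)
    return "INSERT INTO {} VALUES (NULL, {});".format(table, quoted)


def batch(xml_responses):
    """One INSERT statement per response, in input order."""
    return [movie_xml_to_insert(xml_text) for xml_text in xml_responses]


if __name__ == "__main__":
    print(movie_xml_to_insert(SAMPLE))
```

Building SQL by string concatenation is only reasonable here because the output is a file of statements for a personal import. If you insert from code directly, parameterized queries are the safer choice.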

Conclusion

In 3 simple steps you get a complete set of data for each movie title, in a standardized format. Importing this data into a database makes app development easy. You can quickly look up which movies you have by your favorite actor or director, which movies were released in 2009, which were rated highest or got the most votes, etc. Having the data in a database makes life easier :)
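
To make those lookups concrete, here is a small Python sketch using an in-memory SQLite database. The table layout and column names are hypothetical (the post does not show the real schema), and only a few columns are included. The two rows reuse values from the sample output in the comments below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, reduced schema; the real movie_collection table has more columns.
conn.execute(
    """CREATE TABLE movie_collection (
           id INTEGER PRIMARY KEY,
           title TEXT, year INTEGER, director TEXT,
           actors TEXT, rating REAL, votes INTEGER)"""
)

rows = [
    ("The Town", 2010, "Ben Affleck",
     "Ben Affleck, Rebecca Hall, Jon Hamm, Jeremy Renner", 7.6, 104803),
    ("Predator", 1987, "John McTiernan",
     "Arnold Schwarzenegger, Carl Weathers, Kevin Peter Hall, Elpidia Carrillo",
     7.8, 122549),
]
conn.executemany(
    "INSERT INTO movie_collection (title, year, director, actors, rating, votes) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def titles(query, params=()):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(query, params)]


# Movies with a given actor (actors are stored as one comma-separated string).
by_actor = titles(
    "SELECT title FROM movie_collection WHERE actors LIKE ?", ("%Jon Hamm%",)
)
# Movies released in a given year.
from_1987 = titles("SELECT title FROM movie_collection WHERE year = ?", (1987,))
# Highest rated and most voted movie.
top_rated = titles("SELECT title FROM movie_collection ORDER BY rating DESC LIMIT 1")
most_votes = titles("SELECT title FROM movie_collection ORDER BY votes DESC LIMIT 1")

print(by_actor, from_1987, top_rated, most_votes)
```

Storing actors as one string keeps the import simple, but it means actor lookups need LIKE matching. A separate actors table would be the cleaner design for a bigger collection.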

An example in PHP

[Screenshot: example movie collection site]
- code -

Video Demo

Feedback

Update 22.11.2011

See the comments and suggestions regarding this post on Hacker News. There are some improved versions of the Perl script, for example here. I appreciate your feedback; it helps me improve my Perl skills. As far as the IMDb TOS is concerned: this is for personal use only.

  • Guest

    I am getting these errors. help plox. thnx

    • http://bobbelderbos.com/ Bob Belderbos

      what errors?

  • http://bobbelderbos.com/ Bob Belderbos

    I downloaded the script (http://bobbelderbos.com/src/moviecollection/getMovieData) and it runs fine: perl -c says syntax is ok, and reproducing the printscreen:

    $ perl getMovieData.pl "the town"
    INSERT INTO movie_collection VALUES (NULL , 'The Town', '2010', 'R', '17 Sep 2010', 'Crime, Drama, Thriller', 'Ben Affleck', 'Peter Craig, Ben Affleck', 'Ben Affleck, Rebecca Hall, Jon Hamm, Jeremy Renner', 'As he plans his next job, a longtime thief tries to balance his feelings for a bank manager connected to one of his earlier heists, as well as the FBI agent looking to bring him and his crew down. ', 'http://ia.media-imdb.com/images/M/MV5BMTcyNzcxODg3Nl5BMl5BanBnXkFtZTcwMTUyNjQ3Mw@@._V1_SX320.jpg', '2 hrs 5 mins', '7.6', '104803', 'tt0840361', '1329433694');

  • http://bobbelderbos.com/ Bob Belderbos

    same here, downloaded it and

    $ perl test.pl "Predator"

    works perfectly:

    $ perl test.pl "Predator"
    INSERT INTO movie_collection VALUES (NULL , 'Predator', '1987', 'R', '12 Jun 1987', 'Action, Adventure, Sci-Fi, Thriller', 'John McTiernan', 'Jim Thomas, John Thomas', 'Arnold Schwarzenegger, Carl Weathers, Kevin Peter Hall, Elpidia Carrillo', 'A team of commandos, on a mission in a Central American jungle, find themselves hunted by an extra-terrestrial warrior. ', 'http://ia.media-imdb.com/images/M/MV5BMTM1Njk0MjIwN15BMl5BanBnXkFtZTcwNzgwNzEyMQ@@._V1_SX320.jpg', '1 hr 47 mins', '7.8', '122549', 'tt0093773', '1329945077');

    what code is at your line 14 ?

  • http://bobbelderbos.com/ Bob Belderbos

    never done it either. I guess Google is your friend and/or try another OS (virtual machine)

  • http://bobbelderbos.com/ Bob Belderbos

    Didn’t the blog post back you up? I will keep it in mind for the next time. And no I am not shy to talk lol

  • Fred

    I’m new to PHP, how exactly did you use the php code to create that page from the database?
    I mean I have the code… but what do I do with it?

  • http://horasweb.com/ Horas

    Quite hard to implement the code…

  • http://bobbelderbos.com/ Bob Belderbos

    Hi William, nice to see you could re-use and build upon it. Bob

  • http://www.facebook.com/frederic.soreng Frederic Soreng

    hi !

    did u able to resolve the error u mentioned above. I’m facing the same.

    • http://bobbelderbos.com/ Bob Belderbos

      See solution above …

  • http://bobbelderbos.com/ Bob Belderbos

    the snippet provided via the hacker news thread did give these errors, so I combined another suggested solution and now it works, see new copy of the script: http://bobbelderbos.com/src/mo

    when I run it:

    # perl getMovieData.pl "django unchained"
    INSERT INTO movie_collection VALUES ( NULL, 'Django Unchained','2012','R','25 Dec 2012','Adventure, Crime, Drama, Western','Quentin Tarantino','Quentin Tarantino','Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington','With the help of a German bounty hunter, a freed slave sets out to rescue his wife from a brutal Mississippi plantation owner.','http://ia.media-imdb.com/image… h 45 min','','',''1367881184);

  • An

    Does this work for tv shows?

  • Tuši X

    Thanks for this page.
    Script needs a little update:

    line 17: my $rating = escapeSingleQuote($data->{movie}->{imdbRating});
    line 23: my $imdb = escapeSingleQuote($data->{movie}->{imdbID});
    line 25: my $votes = escapeSingleQuote($data->{movie}->{imdbVotes});

  • Mahesh S Rao

    excellent for beginner like me to learn perl. thank you (bit odd commenting a year after ;))

    • http://bobbelderbos.com/ Bob Belderbos

      thanks, it was a nice exercise.

  • http://pierreism.tumblr.com Pierreism

    aw shucks I’m getting this too. Did you manage to find a fix? Or do I have to resort to unix?

    • http://pierreism.tumblr.com Pierreism

      Oh could this be my problem?: http://stackoverflow.com/a/10491909

      “XMLin method can take a file handle, a string, and even an IO::Handle object. What it can’t take is a URL via HTTP”

      It seems like I need a workaround using LWP::Simple; but I’m not sure how I would go about doing that. (Sorry for clogging up your comments!)

      Edit: Nevermind! It was just the syntax of the new omdbapi.com requests that were giving me troubles. Line 14 in getMovieData.pl should read: http://www.omdbapi.com/?r=XML&t=$movie … I think? Still a bit fuzzy on details!

    • http://bobbelderbos.com/ Bob Belderbos

      Hi Pierre, see emails where we discussed this. cheers for checking out my blog

© 2014 Bob Belderbos. All rights reserved.
