In this post I will show you how to import IMDb data for your movies and convert the XML into SQL for a database import. From there you can start building your movie site.
As an example I use a small DVD collection. The movie data (director, actors, release year, etc.) comes from IMDb, and thanks to the IMDb API (Brian Fritz) you can fetch it easily via curl. At the end of this post there is a 5-minute video demo of the whole process.
How it works
- First, I create a database and add the table for the movie collection.
- The script getMovieData.pl fetches the movie data from the API with curl and converts the XML to SQL. Its escapeSingleQuote function escapes single quotes, so values that contain them (for example, titles with an apostrophe) do not break the generated SQL.
- The batch.pl script is a small wrapper that runs getMovieData.pl on each title in the movie list you provide as an input file. For the example list you get 41 INSERT statements.
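To make the XML-to-SQL step more concrete, here is a minimal sketch in Python. It is not the getMovieData.pl script itself. The XML layout, the column names and the table name `movies` are assumptions made for illustration, and the real API response may look different. The function escape_single_quote only mirrors the idea behind escapeSingleQuote by doubling single quotes, which is the standard SQL way to escape them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response; the real API's XML layout may differ.
SAMPLE_XML = """<root>
  <movie title="Schindler's List" year="1993" director="Steven Spielberg" rating="8.9"/>
</root>"""

# Assumed column names, matching the attributes of the sample above.
COLUMNS = ("title", "year", "director", "rating")


def escape_single_quote(value):
    """Double every single quote so the value is safe inside an SQL string literal."""
    return value.replace("'", "''")


def xml_to_insert(xml_text, table="movies"):
    """Build one INSERT statement from the first <movie> element in the XML."""
    movie = ET.fromstring(xml_text).find("movie")
    values = ["'" + escape_single_quote(movie.get(col, "")) + "'" for col in COLUMNS]
    return "INSERT INTO {} ({}) VALUES ({});".format(
        table, ", ".join(COLUMNS), ", ".join(values)
    )


def batch(xml_documents):
    """Wrapper in the spirit of batch.pl: one INSERT statement per XML document."""
    return [xml_to_insert(doc) for doc in xml_documents]


if __name__ == "__main__":
    print(xml_to_insert(SAMPLE_XML))
```

Every value is quoted as a string here to keep the sketch short. If you run the inserts from your own program rather than generating a SQL file, parameterized queries are the safer choice.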
In 3 simple steps you get a complete set of data for each movie title in a standardized format. Importing this data into a database makes app development easy: you can quickly look up which movies you have by your favorite actor or director, which movies were released in 2009, which were rated highest or received the most votes, etc. Having the data in a database makes life easier :)
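The lookups mentioned above could be sketched like this in Python, with an in-memory SQLite database standing in for whatever database you imported into. The table layout and the rows are made-up placeholders, not real IMDb data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE movies "
        "(title TEXT, year INTEGER, director TEXT, rating REAL, votes INTEGER)"
    )
    rows = [
        ("Movie A", 2009, "Director X", 7.5, 1200),
        ("Movie B", 2009, "Director Y", 8.1, 900),
        ("Movie C", 2001, "Director X", 6.9, 3000),
    ]
    conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def titles_by_director(conn, director):
    """Which movies do I have by this director?"""
    cur = conn.execute(
        "SELECT title FROM movies WHERE director = ? ORDER BY title", (director,)
    )
    return [row[0] for row in cur]


def titles_from_year(conn, year):
    """Which movies were released in the given year?"""
    cur = conn.execute(
        "SELECT title FROM movies WHERE year = ? ORDER BY title", (year,)
    )
    return [row[0] for row in cur]


def top_rated(conn, limit=1):
    """Which movies have the highest rating?"""
    cur = conn.execute(
        "SELECT title FROM movies ORDER BY rating DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in cur]


def most_voted(conn, limit=1):
    """Which movies received the most votes?"""
    cur = conn.execute(
        "SELECT title FROM movies ORDER BY votes DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    db = build_demo_db()
    print(titles_by_director(db, "Director X"))
    print(titles_from_year(db, 2009))
    print(top_rated(db), most_voted(db))
```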
An example in PHP
There are comments and suggestions regarding this post on Hacker News. There are some improved versions of the Perl script, for example here. I appreciate your feedback to improve my Perl skills. As far as the IMDb TOS is concerned: this is for personal use only.