CODECUBE VENTURES

DMOZ Data Dump

For a new project I'm working on, I had to download the entire contents of the Open Directory Project.

The content file clocks in at a healthy 1.22 GB ... Now I have to write something to parse this ridiculously huge XML file and load it into a SQL database. I'm guessing that loading it into an XmlDocument is a bad idea, since that would pull the whole file into memory ... I'll probably have to use an XmlReader, since that just streams through the file. The only problem is that the reader itself doesn't know how big the file is, so it'll be hard to show any sort of progress bar.

I'm sure I'll figure something out.
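
One possible way around the progress problem, sketched here in Python rather than .NET: the streaming parser may not know the file size, but the file stream underneath it does, so progress can be estimated as bytes read divided by total bytes. This is only a rough sketch of that idea. The function name `stream_elements`, the `on_progress` callback, and the tag name passed in are all made up for illustration, and it assumes the elements of interest are direct children of the root element.

```python
import os
import xml.etree.ElementTree as ET


def stream_elements(path, tag, on_progress=None):
    """Yield each <tag> element from a large XML file, one at a time.

    on_progress (hypothetical callback) receives a float in (0, 1] estimated
    from how many bytes of the underlying file have been read so far.
    """
    total_bytes = os.path.getsize(path)
    with open(path, "rb") as f:
        # iterparse reads the file in chunks instead of loading it whole.
        context = ET.iterparse(f, events=("start", "end"))
        _event, root = next(context)  # first event is the root's "start"
        for event, elem in context:
            if event == "end" and elem.tag == tag:
                if on_progress is not None and total_bytes:
                    # f.tell() is the read position of the raw file, which
                    # runs slightly ahead of the parser (by up to one chunk).
                    on_progress(min(f.tell() / total_bytes, 1.0))
                yield elem
                # Drop already-processed children so memory stays flat.
                # Assumes the yielded elements sit directly under the root.
                root.clear()
```

The caller would loop over `stream_elements(...)`, insert each element into the database inside the loop, and use the callback to update the progress bar. The estimate runs a little ahead of the parser because reads happen in chunks, which should not matter on a file this size. If the file uses XML namespaces, the tag passed in has to be the fully qualified `{namespace}name` form.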
