
sql - Why use a database instead of just saving your data to disk? - Programmers Stack Exchange



URL:http://programmers.stackexchange.com/questions/190482/why-use-a-database-instead-of-just-saving-your-data-to-disk


One thing that no one seems to have mentioned is indexing of records. Your approach is fine at the moment, assuming you have a very small data set and very few people accessing it.

As your needs get more complex, you're effectively creating a database yourself. Whatever you want to call it, a database is just a set of records stored on disk. Whether you create the file yourself or MySQL, SQLite, or whatever creates the file(s), both are databases.

What you're missing is the complex functionality that has been built into the database systems to make them easier to use.

The main thing that springs to mind is indexing. OK, so you can store 10 or 20 or even 100 or 1,000 records in a serialised array or a JSON string, pull it out of your file, and iterate over it relatively quickly.

Now, imagine you have 10,000, 100,000, or even 1,000,000 records. When someone tries to log in, you're going to have to open a file which is now several hundred megabytes in size, load it into memory in your program, pull out a similarly sized array of information, and then iterate over hundreds of thousands of records just to find the one record you want to access.
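To make that cost concrete, here is a minimal sketch of the flat-file lookup, assuming the records are kept as one JSON array of objects. The file name, the `username` and `password_hash` fields, and the `find_user_in_file` helper are all made up for illustration; they are not from the question.

```python
import json
import os
import tempfile


def find_user_in_file(path, username):
    """Load the whole file, then scan record by record until one matches."""
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)      # the entire data set is read into memory
    for record in records:          # linear scan: cost grows with the record count
        if record["username"] == username:
            return record
    return None


# Hypothetical demo data: 10,000 users written to a temporary JSON file.
records = [{"username": f"user{i}", "password_hash": f"hash{i}"} for i in range(10_000)]
path = os.path.join(tempfile.mkdtemp(), "users.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(records, f)

# Worst case: the record we want is the last one in the file.
found = find_user_in_file(path, "user9999")
```

Every login pays for parsing the whole file and, on average, walking half the records, no matter how small the piece of data you actually need.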

A proper database will allow you to set up indexes on certain fields in your records, so you can query the database and receive a response very quickly, even with huge data sets. Combine that with something like memcached, or even a home-brew caching system (e.g. store the results of a search in a separate table for 10 minutes and load those results in case someone else searches for the same thing soon afterwards), and you'll have very fast queries. That is something you won't get with such a large dataset when you're manually reading from and writing to files.
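As a rough sketch of both ideas, the snippet below uses Python's built-in `sqlite3` module with an index on the lookup column, plus a very small in-memory cache with a 10-minute lifetime. The table layout, the index name, and the `lookup` and `cached_lookup` helpers are assumptions for illustration. The cache is a plain dict rather than memcached or the separate table described above.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password_hash TEXT)"
)
conn.executemany(
    "INSERT INTO users (username, password_hash) VALUES (?, ?)",
    ((f"user{i}", f"hash{i}") for i in range(100_000)),
)
# The index lets the engine jump to a username instead of scanning every row.
conn.execute("CREATE INDEX idx_users_username ON users (username)")
conn.commit()

QUERY = "SELECT id, password_hash FROM users WHERE username = ?"


def lookup(username):
    """Fetch one user row through the index."""
    return conn.execute(QUERY, (username,)).fetchone()


row = lookup("user99999")

# Ask SQLite how it plans to run the query; the detail text is the last column.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("user99999",)).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)

# Home-brew cache: remember each result for 10 minutes.
CACHE_TTL_SECONDS = 600
_cache = {}


def cached_lookup(username, now=None):
    """Return a cached row if it is fresh enough, otherwise query and store it."""
    now = time.monotonic() if now is None else now
    hit = _cache.get(username)
    if hit is not None and now - hit[0] < CACHE_TTL_SECONDS:
        return hit[1]
    result = lookup(username)
    _cache[username] = (now, result)
    return result
```

The query plan text should mention `idx_users_username`, which shows the lookup goes through the index rather than a full table scan. The trade-off with the cache is the usual one: a result can be up to 10 minutes stale.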

Edit: another thing loosely related to indexing is transfer of information. As I said above, when you've got files of hundreds or thousands of MB, you have to load all of that information into memory, iterate over it manually (probably on the same thread), and then manipulate your data.

A database system runs on its own thread(s), or even on its own server. All that is transmitted between your program and the database server is a SQL query, and all that is transmitted back is the data you want to access. You're not loading the whole dataset into memory; all you're sending and receiving is a tiny fraction of your total data set.
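Here is a small sketch of the "send a query, get back only what you asked for" part. It uses `sqlite3`, which runs inside your own process rather than on a separate server, so it only illustrates the shape of the exchange; with a client/server database such as MySQL the same query text would travel over the network instead. The `events` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, action TEXT)"
)
# 50,000 rows spread evenly across 1,000 hypothetical users.
conn.executemany(
    "INSERT INTO events (user_id, action) VALUES (?, ?)",
    ((i % 1000, f"action{i}") for i in range(50_000)),
)
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
conn.commit()

# What the program sends: one short query string plus a parameter.
query = "SELECT action FROM events WHERE user_id = ? ORDER BY id"

# What comes back: only the rows for that user, not the whole table.
rows = conn.execute(query, (42,)).fetchall()
total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

print(f"received {len(rows)} of {total} rows")
```

Your program only ever holds the handful of rows it asked for; the filtering happens inside the database engine.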

