Why use hashing to create pathnames for large collections of files?

Posted by Stephen on Stack Overflow
Published on 2008-12-03T21:56:48Z; indexed on 2010/03/29 14:53 UTC

Filed under: data-structures | database-design | Patterns

Hi, I have noticed a number of cases where an application or database stores collections of files/blobs using a hash to determine the path and filename. I believe the intended outcome is that the path never gets too deep and the folders never get too full, since too many files (or folders) in a single folder makes for slower access.
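To make the scheme concrete, here is a minimal sketch of one way such a path could be derived. It is an illustration only, not taken from any particular application: the function name `hashed_path`, the choice of SHA-1, and the two-level, two-character layout are all assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(key: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a key (e.g. an original filename or ID) to a fixed-depth path.

    Hypothetical helper: SHA-1 and the 2 x 2 layout are assumptions
    chosen for illustration, not any specific tool's scheme.
    """
    # Hex digest of the key; SHA-1 yields 40 hexadecimal characters.
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    # Use the leading characters of the digest as nested folder names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # The full digest serves as the filename inside the innermost folder.
    return PurePosixPath(*parts, digest)


# Build the relative path under which this file would be stored.
print(hashed_path("report.pdf"))
```

With these assumed defaults the depth is always two folders, and each level has at most 256 entries (two hex characters), giving 65,536 leaf folders. Because hash output is spread roughly uniformly, a million files would average about 15 per leaf folder.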

EDIT: Examples are often digital libraries or repositories, though the simplest example I can think of (one that can be installed in about 30 seconds) is the Zotero document/citation database.

Why do this?

EDIT: Thanks, Mat, for the answer. Does this technique of using a hash to create a file path have a name? Is it a pattern? I'd like to read more, but have failed to find anything in the ACM Digital Library.

© Stack Overflow or respective owner
