-2

Trying to clarify my knowledge on databases and indexes. I'd like to know how exactly it works. So I have a few questions:

  1. When indexing a table over column or set of columns, a new table is created that holds "indexes". What exactly is stored in that table? A pointer to the row related to the index & some integer value calculated from the column values (similar to hashcodes)? I have bolded integer, because I want to know how exactly does an index look like.
  2. Are updating, deleting & inserting the only operations that trigger recalculating an index?
  3. Is only the index related to the row recalculated and the whole table sorted afterwards, or are there more calculations made?
5
  • en.wikipedia.org/wiki/Database_index Commented Feb 22, 2022 at 17:40
  • 2
    One thing that most definitely doesn't happen is that the whole table is sorted. The entire point of indexing is to be able to find specific rows quickly without having to put them into specific places. Commented Feb 22, 2022 at 22:16
  • Something to be clear: Tables are a high level abstraction. The list of rows is not the data structure stored directly on disk, with various other details involved to enable performant data manipulation. Once you understand the difference between the abstraction and the storage format, hopefully that will help with the conceptual issue behind this question. Commented Feb 22, 2022 at 22:27
  • @user1937198, surely there is an equally feasible abstraction that describes indexes? Commented Feb 23, 2022 at 6:56
  • @Steve Indexes are arguably a part of the table abstraction (especially on systems with clustered index support) Commented Feb 23, 2022 at 8:19

1 Answer 1

2

An index isn't a table, it is an index. There are multiple kinds of indexes, some ordered, some not. An index might internally be a hash table, a btree, a balanced tree, or something completely different. An index will not be a sorted linear list, as those are too expensive to update. Tables are not sorted, because there's no point. That's what indexes are for.

What is stored in an index is likely the key used to generate the index and a reference to the row it is from. Tables are not arrays, so indexes probably won't be storing integers to reference the rows. How an index references rows in a table is an implementation detail that is different in every database and will be unique to the way the database stores its tables.

Also, an index is unlikely to store a hash, as the hash can be generated from the key and the hash is only used to find the key in the index. If it is an ordered index, it wouldn't need a hash.

Any operation that affects the fields in a table that are the keys the index uses will cause the index to be updated. Generally, indexes are NOT recalculated -- they are just incrementally updated, and they use data structures that support that. However, there may also be some other index specific and whole table operations (like create index or repair table) that may cause an index to be calculated for every row. Also, some databases support temporarily disabling indexes for mass data inserts, and then recalculate the index once when the operation completes.

9
  • I haven't put much thought into this before now, but is there such a thing as a "disordered index"? Surely the fundamental point of an index is to improve efficiency by maintaining a particular order? Commented Feb 23, 2022 at 7:04
  • @Steve An index implemented as a hash table would not provide a useful order. You would need an ordering for efficient range queries (e.g. SELECT * WHERE foo > 0) but a hash index is still perfectly suitable for indexing a unique or primary key (e.g. SELECT * WHERE id = 123) and consequently also for joins. But in databases, hash tables rarely have advantages over btrees which are sorted. Commented Feb 23, 2022 at 8:02
  • Hash is extremely useful for unique() because it is O(1) where a btree is going to be O(log(N)) at best. Unique doesn't care about sort order. Commented Feb 23, 2022 at 12:26
  • @amon, the hash order is useful for the hash itself, presumably? I don't know the full details of how hashes work, but again I can't see how the index could possibly work effectively if its entries were fundamentally disordered. Commented Feb 23, 2022 at 16:50
  • @Steve: You need to look up hashes somewhere else (like wikipedia) and then maybe you will understand both why an unordered index is useful and why hashes are not useful for an index that needs to be ordered. Commented Feb 24, 2022 at 2:15

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.