How are Secondary Indices really stored?
This is based on the DataStax article found here: https://www.datastax.com/blog/2016/04/cassandra-native-secondary-index-deep-dive
Let’s start by creating a simple table:
CREATE TABLE customer (
    id int PRIMARY KEY,
    city text,
    name text
);
Or visualized as a table:
Column | Type | Key |
id | int | Primary Key |
city | text | |
name | text | |
If we then create an index like this:
CREATE INDEX customer_city_idx ON customer (city);
This results in just a “normal” table, only hidden. In it, the column we created the index for becomes the Partition Key, and the Partition Key of the original table becomes the Clustering Key:
Column | Type | Key |
city | text | Partition Key |
id | int | Clustering Key |
With some data, the “customer” table would look like this:
Id | Name | City |
1 | Italia Pizzeria | Kalmar |
2 | Thai Silk | Kalmar |
3 | Royal Thai | Stockholm |
4 | Indian Corner | Malmö |
And the index, which is itself a “table”, would then look like this:
City | Id |
Kalmar | 1 |
Kalmar | 2 |
Stockholm | 3 |
Malmö | 4 |
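The mapping between the two tables above can be sketched in a few lines of Python. This is only an illustration of the layout, not how Cassandra implements it: the `customers` list and the `build_city_index` helper are made-up names for this example, and a plain dictionary stands in for the hidden index table.

```python
from collections import defaultdict

# Rows of the "customer" table from the example above: (id, name, city).
customers = [
    (1, "Italia Pizzeria", "Kalmar"),
    (2, "Thai Silk", "Kalmar"),
    (3, "Royal Thai", "Stockholm"),
    (4, "Indian Corner", "Malmö"),
]


def build_city_index(rows):
    """Hypothetical helper: map each city (the index partition key)
    to the sorted list of customer ids (the index clustering key)."""
    index = defaultdict(list)
    for customer_id, _name, city in rows:
        index[city].append(customer_id)
    # Clustering keys are kept in sorted order within a partition.
    return {city: sorted(ids) for city, ids in index.items()}


city_index = build_city_index(customers)
print(city_index["Kalmar"])  # [1, 2]
```

Looking up a city in `city_index` gives the ids, which are then used to read the full rows from the source table.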
When a cluster is used, the data of the source table is distributed over the nodes using the Murmur3 hash algorithm. The index table is also distributed, BUT not on its own: each index entry is stored on the same node as the source-table row it points to.
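To make the co-location concrete, here is a rough Python sketch under loud assumptions: `NUM_NODES`, `node_for`, `Node` and `query_by_city` are invented for this example, and `zlib.crc32` modulo the node count is only a stand-in for Murmur3 token placement. Each node indexes only the rows it stores itself, so a lookup by city cannot be routed to one node from the city value alone and has to ask the nodes.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size

customers = [
    (1, "Italia Pizzeria", "Kalmar"),
    (2, "Thai Silk", "Kalmar"),
    (3, "Royal Thai", "Stockholm"),
    (4, "Indian Corner", "Malmö"),
]


def node_for(customer_id):
    """Pick a node from the source table's partition key (id).
    Assumption: crc32 % NUM_NODES stands in for Murmur3 tokens."""
    return zlib.crc32(str(customer_id).encode("utf-8")) % NUM_NODES


class Node:
    def __init__(self):
        self.rows = {}                      # id -> (name, city)
        self.city_index = defaultdict(set)  # city -> ids stored on THIS node

    def insert(self, customer_id, name, city):
        # The row and its index entry are written on the same node.
        self.rows[customer_id] = (name, city)
        self.city_index[city].add(customer_id)

    def find_by_city(self, city):
        ids = sorted(self.city_index.get(city, ()))
        return [(cid,) + self.rows[cid] for cid in ids]


cluster = [Node() for _ in range(NUM_NODES)]
for customer_id, name, city in customers:
    cluster[node_for(customer_id)].insert(customer_id, name, city)


def query_by_city(city):
    """Ask every node, since each one only knows its own index entries."""
    results = []
    for node in cluster:
        results.extend(node.find_by_city(city))
    return sorted(results)


print(query_by_city("Kalmar"))
```

Whatever node each row lands on, its index entry is next to it, which is why the sketch never needs a second lookup on another node to resolve an id.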