How are Secondary Indices really stored?

This is based on the DataStax article found here: https://www.datastax.com/blog/2016/04/cassandra-native-secondary-index-deep-dive

Let’s create a simple table:

CREATE TABLE customer (
    id int PRIMARY KEY,
    city text,
    name text
);

Or, visualized as a table:

Column	Type	Key
id	int	Primary Key
city	text
name	text

If we then create an index like this:

CREATE INDEX customer_city_idx ON customer (city);

This results in a “normal” table that is simply hidden. In it, the column we created the index on becomes the partition key, and the partition key of the original table becomes the clustering key:

Column	Type	Key
city	text	Partition Key
id	int	Clustering Key
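
To make the shape of the hidden table concrete, it can be written out as if it were a regular table. This is only a simplified sketch: Cassandra creates and maintains the real index table itself, and the name customer_city_idx_sketch is hypothetical, chosen here for illustration.

```cql
-- Hypothetical illustration only: the real index table is hidden and
-- managed by Cassandra; you never create it yourself.
CREATE TABLE customer_city_idx_sketch (
    city text,              -- the indexed column becomes the partition key
    id int,                 -- the original partition key becomes the clustering key
    PRIMARY KEY (city, id)
);
```

With this layout, all ids for one city sit in a single partition, sorted by id.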

With some data, the “customer” table would look like this:

Id	Name	City
1	Italia Pizzeria	Kalmar
2	Thai Silk	Kalmar
3	Royal Thai	Stockholm
4	Indian Corner	Malmö
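
As a sketch, the sample rows above could be inserted as follows, assuming the customer table and the customer_city_idx index from earlier already exist. The final query filters on city, which is the kind of lookup the index is there to serve.

```cql
-- Populate the customer table with the sample data.
INSERT INTO customer (id, name, city) VALUES (1, 'Italia Pizzeria', 'Kalmar');
INSERT INTO customer (id, name, city) VALUES (2, 'Thai Silk', 'Kalmar');
INSERT INTO customer (id, name, city) VALUES (3, 'Royal Thai', 'Stockholm');
INSERT INTO customer (id, name, city) VALUES (4, 'Indian Corner', 'Malmö');

-- Query on the indexed, non-key column; this should return the two Kalmar rows.
SELECT id, name FROM customer WHERE city = 'Kalmar';
```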

And the index, which is itself a “table”, would then look like this:

City	Id
Kalmar	1
Kalmar	2
Stockholm	3
Malmö	4

When a cluster is used, the data of the source table is distributed over the nodes by hashing the partition key with the Murmur3 algorithm. The index table is also distributed, BUT not by its own partition key: each node stores the index entries for the source rows it holds, so the index data lives on the same node as the data of the source table.
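
To see the hash value that decides where each source row is placed, the CQL token() function can be used. A small sketch, assuming the cluster runs with the default Murmur3Partitioner:

```cql
-- token(id) is the hash of the partition key; the node owning that token
-- range stores the row, and with it the matching index entry.
SELECT token(id), id, city FROM customer;
```

Note that the token is computed from id, not from city, so the two Kalmar rows may well end up on different nodes, each with its own local index entry.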

tsoft.se

Tobias – With a Passion For Software Development

Monthly Archives: August 2020

Apache Cassandra Secondary Indices
