Monday, May 6, 2019

Three Strategies For Improving Data Retrieval

“Originally there was only one Veda, and there was no necessity of reading it. People were so intelligent and had such sharp memories that by once hearing from the lips of the spiritual master they would understand. They would immediately grasp the whole purport. But 5,000 years ago Vyasadeva put the Vedas in writing for the people in this age, Kali-yuga. He knew that eventually the people would be short-lived, their memories would be very poor and their intelligence would not be very sharp.” (Shrila Prabhupada, Shri Ishopanishad, Introduction)

You want to study the natural world. That is your initial inclination when trying to make sense of anything. Before accepting information on someone’s word, prior to believing one hundred percent in a particular philosophy, it is better to study what you can.

The main issue is that there is just so much information. Even when focusing on only one subject area, like science, there are volumes and volumes of published literature. There is so much scientific evidence to analyze, and not all of the experiments are still relevant today. New researchers come along who disprove previously accepted assertions.

With the help of modern technology, one method to ease the bottleneck in data retrieval is to store the information in a relational database management system, or RDBMS. More than just tables on a spreadsheet, there is some intelligence within the design. Data enters the system in a certain way. Extraction occurs through syntactically correct programming statements known as queries.
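To make the idea of a query concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and rows are made up purely for illustration; any RDBMS would follow the same pattern of structured entry followed by a query.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Data enters the system in a certain way: typed columns in a table.
# The "experiments" table and its contents are hypothetical.
conn.execute(
    "CREATE TABLE experiments (id INTEGER PRIMARY KEY, subject TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO experiments (id, subject, year) VALUES (?, ?, ?)",
    [(1, "physics", 1905), (2, "biology", 1953), (3, "physics", 1964)],
)

# Extraction occurs through a query.
rows = conn.execute(
    "SELECT id, year FROM experiments WHERE subject = ? ORDER BY year",
    ("physics",),
).fetchall()
print(rows)
```

The question asked is "which physics experiments do we have, oldest first," and the system works out how to find the answer.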

Within the RDBMS there are certain strategies to help find data. Nothing is perfect, and every layer adds further complexity to the import process, but at least there is some way to ask questions and get answers fast.

1. Clustered index

For all intents and purposes, this is the table. Create a clustered index and you are organizing the actual data of the table. A balanced tree, or b-tree, provides a structure for accepting requests and quickly locating the data to be retrieved. Without this index, the table is known as a heap, whose one advantage is faster large inserts, since new rows do not have to be placed in any particular order.
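The difference between an organized table and a heap can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a sorted list with binary search stands in for the b-tree, and the row values are invented. A real engine stores rows on pages and walks a multi-level tree.

```python
import bisect

# Rows kept in clustering-key order; the first element of each tuple is the key.
clustered = [(10, "a"), (20, "b"), (30, "c"), (40, "d")]
keys = [row[0] for row in clustered]

def seek(key):
    """Binary search on the sorted keys, standing in for a b-tree descent."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return clustered[pos]
    return None

# A heap holds the same rows in arrival order, with no organization.
heap = [(30, "c"), (10, "a"), (40, "d"), (20, "b")]

def scan(key):
    """Without an index, every row may have to be checked."""
    for row in heap:
        if row[0] == key:
            return row
    return None

print(seek(30), scan(30))
```

Both calls return the same row. The seek gets there by repeatedly halving the search space, while the scan may have to read every row.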

2. Non-clustered index

Since the clustered index organizes the table, with the actual data at the leaf level of the tree, there can only be one per table. Non-clustered indexes are separate structures that point back to the table, typically through the clustering key, and so there can be many. We can think of it like the difference between the page numbers in a book and the index at the back. The page numbers deal with the actual structure of the book, while the index at the back ultimately references the page numbers.

This technique carries the added cost of a key lookup. If I find what I am looking for in the non-clustered index, I typically have to use the clustering key, which is one or more columns defining the clustered index, to look up the other columns of data I need.
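The two-step nature of the key lookup can be sketched as follows. The table contents and the function name find_titles_by_year are hypothetical, and a dictionary plus a sorted list are stand-ins for the real on-disk structures.

```python
import bisect

# The table, organized by its clustering key (id).
table = {
    101: {"id": 101, "title": "A", "year": 1990},
    102: {"id": 102, "title": "B", "year": 1985},
    103: {"id": 103, "title": "C", "year": 1990},
}

# Non-clustered index on year: sorted (year, clustering key) pairs.
# It holds only the indexed column and a pointer back to the table.
year_index = sorted((row["year"], key) for key, row in table.items())

def find_titles_by_year(year):
    """Seek in the index, then perform a key lookup for each match."""
    titles = []
    pos = bisect.bisect_left(year_index, (year,))
    while pos < len(year_index) and year_index[pos][0] == year:
        clustering_key = year_index[pos][1]
        # The key lookup: the title is not in the index, so visit the table.
        titles.append(table[clustering_key]["title"])
        pos += 1
    return titles

print(find_titles_by_year(1990))
```

Had the query asked only for the year and the id, the index alone would have been enough and the trip back to the table could be skipped.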

3. Statistics

These aren’t as involved as indexes, since they typically take only one page of data in the RDBMS. The benefit here is that there is information on the indexes. The statistic tied to a particular index summarizes the organized data as a histogram, grouping the values into buckets that show roughly how the data is distributed. This helps in optimizing the query plan.

If I have a non-clustered index on a date column, for instance, the statistic should know roughly how many records are within a particular date range. This way when the query request arrives, the optimizer can know whether to seek in a particular index or not. It may make sense to just scan the entire table if a lot of data has to be returned.
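A rough sketch of that decision, assuming a made-up histogram with one bucket per year and an arbitrary 30 percent cutoff. Real optimizers use far more detailed cost models; the numbers and names here are for illustration only.

```python
# Hypothetical histogram on a date column, simplified to one bucket per year.
bucket_year = [2015, 2016, 2017, 2018, 2019]
bucket_rows = [100, 100, 100, 100, 600]
TOTAL_ROWS = sum(bucket_rows)

def estimate_rows(first_year, last_year):
    """Estimate how many rows fall in the range by summing matching buckets."""
    return sum(
        count
        for year, count in zip(bucket_year, bucket_rows)
        if first_year <= year <= last_year
    )

def choose_plan(first_year, last_year, threshold=0.3):
    """Seek when few rows are expected; otherwise scan the whole table.

    The threshold is an assumed value, not one taken from any real engine.
    """
    fraction = estimate_rows(first_year, last_year) / TOTAL_ROWS
    return "seek" if fraction < threshold else "scan"

print(choose_plan(2016, 2016))
print(choose_plan(2018, 2019))
```

The first range touches about a tenth of the rows, so seeking in the index is worthwhile. The second covers most of the table, so a full scan is likely cheaper than many individual lookups.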

…

There is no free lunch in such a system. New data does not always arrive in a numerically increasing fashion. Out-of-order inserts leave the indexes fragmented, so they have to be reorganized or rebuilt on a regular basis, which brings an I/O overhead. The statistics have to be accurate; otherwise the query plans will suffer.
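One common way this cost shows up is the page split. The sketch below assumes a tiny page capacity of four keys and a simplified split rule that cuts a full page in half; actual engines differ in the details.

```python
import bisect

PAGE_CAPACITY = 4  # assumed, unrealistically small for illustration

def insert(pages, key):
    """Insert key into the ordered pages; return True if a page had to split."""
    # Find the first page whose largest key is >= key, else use the last page.
    idx = len(pages) - 1
    for i, page in enumerate(pages):
        if page and key <= page[-1]:
            idx = i
            break
    page = pages[idx]
    if len(page) < PAGE_CAPACITY:
        bisect.insort(page, key)
        return False
    # A key larger than everything simply starts a fresh page at the end.
    if idx == len(pages) - 1 and key > page[-1]:
        pages.append([key])
        return False
    # Otherwise the full page splits in half to make room.
    mid = PAGE_CAPACITY // 2
    left, right = page[:mid], page[mid:]
    bisect.insort(left if key <= left[-1] else right, key)
    pages[idx:idx + 1] = [left, right]
    return True

# Increasing keys fill pages completely, with no splits.
ordered = [[]]
ordered_splits = sum(insert(ordered, k) for k in range(1, 9))

# One out-of-order key forces a split and leaves partly empty pages behind.
unordered = [[10, 20, 30, 40], [50, 60, 70, 80]]
unordered_split = insert(unordered, 25)
print(ordered, ordered_splits)
print(unordered, unordered_split)
```

After the split, the same data occupies three pages instead of two. That wasted space and extra reading is what index maintenance later cleans up.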

Even with the help of modern technology, it is impossible to study every recorded observation made about the world throughout the course of known history. This is the ascending process of knowledge, and it is imperfect. The reason is that no one can ever know everything. Even if they had all the data stored, there would be issues in retrieval and subsequent analysis.

One would think students of Vedic literature would have a similar problem. Not just one book. Not just one teacher. Not just one time period. The Vedas are ever-expanding since they continue to glorify the origin of everything. He is both the beginning and without one. He is the final word and also beyond any identified end to time.

There are many works describing God the person and notable saintly people devoted to Him. Teachers like Vyasadeva could recite these works from memory. They did not need an expensive server managing indexes in order to retrieve a specific verse describing an important aspect of life. They could remember everything and repeat according to the context.

The present age of Kali negatively influences memory power. People can’t remember as well as they used to. Proof is there in analyzing only a short duration of time, such as fifty years. Without computers and calculators, students were taught to commit much more important information to memory. Now there is seemingly no need, since anything can be looked up in a matter of seconds on a smartphone.

Fortunately, the entire published database of Vedic literature does not have to be held in memory. Knowing a few simple concepts is enough to reach perfection in life. In a single verse, Shri Krishna explains.

भोक्तारं यज्ञ-तपसां
सर्व-लोक-महेश्वरम्
सुहृदं सर्व-भूतानां
ज्ञात्वा मां शान्तिम् ऋच्छति

bhoktāraṁ yajña-tapasāṁ
sarva-loka-maheśvaram
suhṛdaṁ sarva-bhūtānāṁ
jñātvā māṁ śāntim ṛcchati

“The sages, knowing Me as the ultimate purpose of all sacrifices and austerities, the Supreme Lord of all planets and demigods and the benefactor and well-wisher of all living entities, attain peace from the pangs of material miseries.” (Lord Krishna, Bhagavad-gita, 5.29)

He is the enjoyer of every effort in sacrifice. He is the Supreme Lord over the entire universe and every empowered being. He is also the best friend of every living being. Such a well-wisher is the one to impress, who is easily won over through acts of devotion, such as chanting the holy names: Hare Krishna Hare Krishna, Krishna Krishna, Hare Hare, Hare Rama Hare Rama, Rama Rama, Hare Hare.

In Closing:

Strategy for data to retrieve,

But at cost when to receive.

Indexing for system maintaining,

But not perfect for all sustaining.

Since universe always expanding,

But Vyasa and others commanding.

With verses to memory applied,

For future in books supplied.
