Scrigroup - Documente si articole


HomeDocumenteUploadResurseAlte limbi doc
AccessAdobe photoshopAlgoritmiAutocadBaze de dateCC sharp
CalculatoareCorel drawDot netExcelFox proFrontpageHardware
HtmlInternetJavaLinuxMatlabMs dosPascal
PhpPower pointRetele calculatoareSqlTutorialsWebdesignWindows

AspAutocadCDot netExcelFox proHtmlJava
LinuxMathcadPhotoshopPhpSqlVisual studioWindowsXml

Tutorial - Indexes Part 2 - When to use them


+ Font mai mare | - Font mai mic

Tutorial - Indexes Part 2 - When to use them  

Since we're getting great speed improvements, why don't we index everything we can? Before we go any further, there are some

issues to be aware of when using indexes.

As you might have noticed from the previous tutorial, when we created the indexes they have been created after the table has been

created and all of the data has been imported. This is for a reason - indexes are very time consuming to create and update.

Each time a row is added or updated, the index needs to be updated with the new information.

Let's create another data import script and check the difference.

Here's the new script:


$count = 1;

$outputfile = 'import2.sql';

@chars = ('A 'Z', 'a 'z', 0 .. 9);

@numbers = (0 .. 9);

@single_chars = ('a' .. 'e');

$totalrecords = 5000000; # 5 million

open(OUTPUT, '> $outputfile');

print OUTPUT 'CREATE TABLE indextest2 (';

print OUTPUT 'id INT, name TEXT, number INT, letter CHAR(1)';

print OUTPUT ');';

print OUTPUT 'COPY indextest2 (id,name,number,letter) FROM stdin;n';

print OUTPUT 'CREATE INDEX index2_id ON indextest2(id);';

while ($count <= $totalrecords) (1 .. 8)]);

$randnum = join('', @numbers [map (1 .. 8)]);

$randletter = join('', @single_chars [map(1)]);

print OUTPUT



print OUTPUT '.n';

print OUTPUT 'VACUUM ANALYZE indextest2;n';

close OUTPUT;


The only difference is we're creating the index before importing the data - just the one line.

Let's import this data and compare the time it takes to the previous tutorial:

[pgsql@elrond pgsql]$ time psql -d indextest < import2.sql




real 10m42.827s

user 0m21.720s

sys 0m4.320s

Just by changing the order of commands and creating the index before any data is imported, we've doubled the time it takes to

import the same amount of records.

As you can see, it is definitely beneficial to create the index after the data has been imported so while the data is importing

the index doesn't get updated or referenced. Smaller tables might not have the same difference in processing time, but very large

tables of data like these the processing time is quite substantial. Of course, this only applies to tables where you are able to

bulk load data like this. For dynamic tables where rows are getting added and removed constantly like for a website, there is no

advantage to taking this approach.

Deleting a row from the table also affects the index especially if you delete a large number of rows, 1 or 2 rows will not affect

the index.

The index is cached in PostgreSQL so it doesn't have to check the number of rows and information each time the index is manipulated or accessed. To reset the cache, we need to VACUUM ANALYZE the table. It basically looks at the table to calculate the number of rows, the distribution of rows and so on for the index to use.

Politica de confidentialitate | Termeni si conditii de utilizare



Vizualizari: 969
Importanta: rank

Comenteaza documentul:

Te rugam sa te autentifici sau sa iti faci cont pentru a putea comenta

Creaza cont nou

Termeni si conditii de utilizare | Contact
© SCRIGROUP 2024 . All rights reserved