Why are partial PostgreSQL HASH indices not smaller than full indices?Why does MySQL not have hash indices on...

If nine coins are tossed, what is the probability that the number of heads is even?

Increase the space between numerator and denominator

Averaging over columns while ignoring zero entries

Inorganic chemistry handbook with reaction lists

What does "rhumatis" mean?

Use Mercury as quenching liquid for swords?

How does learning spells work when leveling a multiclass character?

What does it take to become a wilderness skills guide as a business?

After Brexit, will the EU recognize British passports that are valid for more than ten years?

How to install "rounded" brake pads

I am the light that shines in the dark

Why aren't there more Gauls like Obelix?

What is the oldest European royal house?

Are small insurances worth it?

How to educate team mate to take screenshots for bugs with out unwanted stuff

Can inspiration allow the Rogue to make a Sneak Attack?

How to negotiate a patent idea for a raise?

“I had a flat in the centre of town, but I didn’t like living there, so …”

Why do we call complex numbers “numbers” but we don’t consider 2-vectors numbers?

What is the url of this routes.xml?

Other authors are notified except me, good or bad sign?

Precision notation for voltmeters

Why restrict private health insurance?

Ultrafilters as a double dual



Why are partial PostgreSQL HASH indices not smaller than full indices?


Why does MySQL not have hash indices on MyISAM or InnoDB?Adding index to large mysql tablesWhy would function based indices I’ve created lower the cost but not show up in the explain plan breakdown?Optimizing indexesPostgreSQL indices on multi fields queriesPostgreSQL not using partial index when using boolean in WHERE clause512 Bytes are not being used from SQL Server's 8 KByte data pageIndexing strategy for VARCHAR2 LIKE searchHow is it possible for Hash Index not to be faster than Btree for equality lookups?PostgreSQL suitabilty of Hash Index on PK and FK













2















I want to create the most efficient index for a sparsely populated column. I only need equality operations, so a HASH index should be beneficial.



Now I'm wondering why a partial HASH index isn't smaller than a full hash index:



CREATE INDEX full_hash    ON mytable USING HASH(my_id); # 256 MB
CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT NULL; # 256 MB

CREATE INDEX full_btree ON mytable (my_id); # 537 MB
CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 32 MB


Both hash indices take exactly the same amount of space (as shown in pgHero). However, when using standard BTREE indices, the partial index takes only 5% of the space of the full index.



Are partial HASH indices not supported in PostgreSQL 10?










share|improve this question



























    2















    I want to create the most efficient index for a sparsely populated column. I only need equality operations, so a HASH index should be beneficial.



    Now I'm wondering why a partial HASH index isn't smaller than a full hash index:



    CREATE INDEX full_hash    ON mytable USING HASH(my_id); # 256 MB
    CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT NULL; # 256 MB

    CREATE INDEX full_btree ON mytable (my_id); # 537 MB
    CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 32 MB


    Both hash indices take exactly the same amount of space (as shown in pgHero). However, when using standard BTREE indices, the partial index takes only 5% of the space of the full index.



    Are partial HASH indices not supported in PostgreSQL 10?










    share|improve this question

























      2












      2








      2








      I want to create the most efficient index for a sparsely populated column. I only need equality operations, so a HASH index should be beneficial.



      Now I'm wondering why a partial HASH index isn't smaller than a full hash index:



      CREATE INDEX full_hash    ON mytable USING HASH(my_id); # 256 MB
      CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT NULL; # 256 MB

      CREATE INDEX full_btree ON mytable (my_id); # 537 MB
      CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 32 MB


      Both hash indices take exactly the same amount of space (as shown in pgHero). However, when using standard BTREE indices, the partial index takes only 5% of the space of the full index.



      Are partial HASH indices not supported in PostgreSQL 10?










      share|improve this question














      I want to create the most efficient index for a sparsely populated column. I only need equality operations, so a HASH index should be beneficial.



      Now I'm wondering why a partial HASH index isn't smaller than a full hash index:



      CREATE INDEX full_hash    ON mytable USING HASH(my_id); # 256 MB
      CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT NULL; # 256 MB

      CREATE INDEX full_btree ON mytable (my_id); # 537 MB
      CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 32 MB


      Both hash indices take exactly the same amount of space (as shown in pgHero). However, when using standard BTREE indices, the partial index takes only 5% of the space of the full index.



      Are partial HASH indices not supported in PostgreSQL 10?







      postgresql index index-tuning postgresql-10






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked 7 hours ago









      Ortwin GentzOrtwin Gentz

      1224




      1224






















          2 Answers
          2






          active

          oldest

          votes


















          3














          It's not explicitly stated in the documentation, but in the source code there is the following comment:



          /*
          * We do not insert null values into hash indexes. This is okay because
          * the only supported search operator is '=', and we assume it is strict.
          */


          So the is not null predicate does indeed change nothing, as null values are always ignored for hash indexes (which does make sense, as comparing null values with = would never return true).






          share|improve this answer





















          • 1





            Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

            – Ortwin Gentz
            6 hours ago











          • This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

            – jjanes
            5 hours ago











          • The full btree index is more than double the size of the hash index.

            – Ortwin Gentz
            16 mins ago



















          3














          I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, it tries to pre-size the index to hold all the data so that it doesn't have to keep splitting buckets as the index is created. But the code for doing this does not take the NULL fraction of the column nor (apparently) the selectivity of the partial index clause into account, so it arrives at a too-large number for the pre-sizing.



          If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.






          share|improve this answer























            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "182"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f231647%2fwhy-are-partial-postgresql-hash-indices-not-smaller-than-full-indices%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            3














            It's not explicitly stated in the documentation, but in the source code there is the following comment:



            /*
            * We do not insert null values into hash indexes. This is okay because
            * the only supported search operator is '=', and we assume it is strict.
            */


            So the is not null predicate does indeed change nothing, as null values are always ignored for hash indexes (which does make sense, as comparing null values with = would never return true).






            share|improve this answer





















            • 1





              Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

              – Ortwin Gentz
              6 hours ago











            • This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

              – jjanes
              5 hours ago











            • The full btree index is more than double the size of the hash index.

              – Ortwin Gentz
              16 mins ago
















            3














            It's not explicitly stated in the documentation, but in the source code there is the following comment:



            /*
            * We do not insert null values into hash indexes. This is okay because
            * the only supported search operator is '=', and we assume it is strict.
            */


            So the is not null predicate does indeed change nothing, as null values are always ignored for hash indexes (which does make sense, as comparing null values with = would never return true).






            share|improve this answer





















            • 1





              Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

              – Ortwin Gentz
              6 hours ago











            • This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

              – jjanes
              5 hours ago











            • The full btree index is more than double the size of the hash index.

              – Ortwin Gentz
              16 mins ago














            3












            3








            3







            It's not explicitly stated in the documentation, but in the source code there is the following comment:



            /*
            * We do not insert null values into hash indexes. This is okay because
            * the only supported search operator is '=', and we assume it is strict.
            */


            So the is not null predicate does indeed change nothing, as null values are always ignored for hash indexes (which does make sense, as comparing null values with = would never return true).






            share|improve this answer















            It's not explicitly stated in the documentation, but in the source code there is the following comment:



            /*
            * We do not insert null values into hash indexes. This is okay because
            * the only supported search operator is '=', and we assume it is strict.
            */


            So the is not null predicate does indeed change nothing, as null values are always ignored for hash indexes (which does make sense, as comparing null values with = would never return true).







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited 6 hours ago

























            answered 7 hours ago









            a_horse_with_no_namea_horse_with_no_name

            40.5k777113




            40.5k777113








            • 1





              Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

              – Ortwin Gentz
              6 hours ago











            • This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

              – jjanes
              5 hours ago











            • The full btree index is more than double the size of the hash index.

              – Ortwin Gentz
              16 mins ago














            • 1





              Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

              – Ortwin Gentz
              6 hours ago











            • This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

              – jjanes
              5 hours ago











            • The full btree index is more than double the size of the hash index.

              – Ortwin Gentz
              16 mins ago








            1




            1





            Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

            – Ortwin Gentz
            6 hours ago





            Interesting. So apparently, hash indexes aren't appropriate for sparsely populated columns. I tested with a column even less populated (only a few 100 records out of >10 m total) and the index took 256 MB as well. So it looks like the space of a hash index only depends on table size, not on the number of different indexable values.

            – Ortwin Gentz
            6 hours ago













            This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

            – jjanes
            5 hours ago





            This explains why the two HASH indexes are the same size as each other, but not why they are so large compared to the btree indexes.

            – jjanes
            5 hours ago













            The full btree index is more than double the size of the hash index.

            – Ortwin Gentz
            16 mins ago





            The full btree index is more than double the size of the hash index.

            – Ortwin Gentz
            16 mins ago













            3














            I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, it tries to pre-size the index to hold all the data so that it doesn't have to keep splitting buckets as the index is created. But the code for doing this does not take the NULL fraction of the column nor (apparently) the selectivity of the partial index clause into account, so it arrives at a too-large number for the pre-sizing.



            If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.






            share|improve this answer




























              3














              I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, it tries to pre-size the index to hold all the data so that it doesn't have to keep splitting buckets as the index is created. But the code for doing this does not take the NULL fraction of the column nor (apparently) the selectivity of the partial index clause into account, so it arrives at a too-large number for the pre-sizing.



              If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.






              share|improve this answer


























                3












                3








                3







                I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, it tries to pre-size the index to hold all the data so that it doesn't have to keep splitting buckets as the index is created. But the code for doing this does not take the NULL fraction of the column nor (apparently) the selectivity of the partial index clause into account, so it arrives at a too-large number for the pre-sizing.



                If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.






                share|improve this answer













                I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, it tries to pre-size the index to hold all the data so that it doesn't have to keep splitting buckets as the index is created. But the code for doing this does not take the NULL fraction of the column nor (apparently) the selectivity of the partial index clause into account, so it arrives at a too-large number for the pre-sizing.



                If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 5 hours ago









                jjanesjjanes

                13.6k917




                13.6k917






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Database Administrators Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f231647%2fwhy-are-partial-postgresql-hash-indices-not-smaller-than-full-indices%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Installing LyX: “No textclass is found.”LyX installation error- text class not found- 'Reconfigure' or...

                    (1602) Indiana Índice Designación y nombre Características orbitales Véase...

                    Universidad Autónoma de Occidente Índice Historia Campus Facultades Programas Académicos Medios de...