Performance Tuning for PostgreSQL

Howdy!

I’m embarking on adjusting our config and twiddling other bits to
maximize the DB’s speed while running RT. I thought I’d share what I’ve
learned so far, along with a few questions and an invitation for
suggestions.

What I’ve learned is fairly limited; I’ve read Bruce Momjian’s article
in the August 2001 Linux Journal
(http://www.linuxjournal.com/article.php?sid=4791) which outlined a bit
of how to think about Shared Buffer Caches, Sort Memory Batch Size, and
Disk/Filesystem location.

What I took from that was that, optimally, your buffer caches should be
large enough to hold the static tables from all DBs on the server as
well as the expected number of active rows in the dynamic tables. I’m
currently working out what that size is by reading the table setups
in SQL and doing the math by hand; I’ll post findings soon.

If anyone has a faster way to ask a PostgreSQL DB “How big is that table
in bytes?” and “What’s the maximum size of a row in this table in
bytes?” I’d really love to know it. =]
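For the record, one way to ask the system catalogs directly might look
something like the sketch below. Caveats: “tickets” is just a stand-in
table name, the 8192 assumes the default block size (BLCKSZ may differ
if your server was compiled with another value), and relpages/reltuples
are only as fresh as the last VACUUM or ANALYZE.

```sql
-- Estimated on-disk size of a table, in bytes (relpages counts 8 kB blocks):
SELECT relname, relpages * 8192 AS bytes, reltuples AS est_rows
FROM pg_class
WHERE relname = 'tickets';   -- stand-in table name

-- Average (not maximum) width of each column, from ANALYZE's statistics,
-- via the pg_stats view:
SELECT tablename, attname, avg_width
FROM pg_stats
WHERE tablename = 'tickets';
```

Note that these figures answer “how big on disk” and “how wide on
average,” not the theoretical maximum row size, but they may be close
enough for sizing a buffer cache.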

Meanwhile, I found that we got an astounding speedup from doing
"vacuumdb -z -v -e" which executes the VACUUM ANALYZE command. I hadn’t
done this sooner because of my misunderstanding of what the command
would do; I read its explanation as “Will remove deleted rows” and, as
RT never deletes rows, I didn’t expect it to help.

It helped. A LOT. It was taking ~12 seconds to show me the Home page;
immediately afterwards, that has cut down to 2-3 seconds. Can anyone
explain that one to me? I’d love to fully understand.

My suspicion is that the Analyze feature is helping considerably, too.
According to its man page, it’s supposed to feed information to the
PostgreSQL Optimizer to help the DB better arrange itself for queries. I
suspect that it had never been run after setting up the schema, so the
DB was going with no hints. Can anyone confirm or deny this? If it turns
out that this is correct, I’d highly suggest that “VACUUM ANALYZE” be
run immediately after setting up a new PG schema.

Well, that’s all I know so far. More as I’m able to learn it. =]

Cheers!

–j
Jim Meyer, Geek At Large purp@wildbrain.com

“JM” == Jim Meyer purp@wildbrain.com writes:

JM> Meanwhile, I found that we got an astounding speedup from doing
JM> “vacuumdb -z -v -e” which executes the VACUUM ANALYZE command. I hadn’t
JM> done this sooner because of my misunderstanding of what the command
JM> would do; I read its explanation as “Will remove deleted rows” and, as
JM> RT never deletes rows, I didn’t expect it to help.

In Postgres, “deleted” rows are also the rows left over after any kind
of update to a tuple. This is part of the multi-version concurrency
feature (i.e., two transactions can be running, and each sees exactly
what was there at the start of its transaction no matter what else is
changed). Thus, you need to vacuum your database regularly to clear
out these dead rows and allow their space to be re-used.
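To make that concrete, here’s a small sketch you could run against a
scratch database (table and column names are made up for illustration):

```sql
-- Every UPDATE writes a new version of the row; the old version
-- becomes a dead tuple that still occupies space on disk.
CREATE TABLE t (id int, val int);
INSERT INTO t VALUES (1, 0);
UPDATE t SET val = 1 WHERE id = 1;  -- old (1, 0) version is now dead
UPDATE t SET val = 2 WHERE id = 1;  -- and another dead version piles up

-- VACUUM VERBOSE reports the dead ("removable") tuples and makes
-- their space available for re-use.
VACUUM VERBOSE t;
```

So even though RT never issues DELETEs, every status change or comment
that UPDATEs a row leaves a dead tuple behind, which is why vacuuming
helped so much.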

Adding in the “analyze” is helpful if the distribution of your index
keys changes, but it is not expensive to throw it in as part of your
daily (or more frequent, depending on usage) vacuum. Running it once
after creating the tables is of little use. You need to analyze your
actual data.
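You can watch the effect of fresh statistics with EXPLAIN. A sketch
(again with a stand-in table and column; the exact plans you see will
depend on your data and indexes):

```sql
-- Before ANALYZE, the planner works from default assumptions and will
-- often pick a sequential scan regardless of the data.
EXPLAIN SELECT * FROM tickets WHERE status = 'open';

-- After loading real data, refresh the statistics:
ANALYZE tickets;

-- The same EXPLAIN may now choose an index scan, if the observed
-- distribution of "status" values makes that cheaper.
EXPLAIN SELECT * FROM tickets WHERE status = 'open';
```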

For my usage, a daily vacuum of RT is sufficient. For other
databases, there are some tables I vacuum every 6 hours and some every
4 hours. This keeps the on-disk image from getting too big since the
dead rows are not used again until they’ve been vacuumed.
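A schedule like that is easy to hang off cron. A sketch of what the
crontab entries might look like (paths, database names, and the
“busytable” table are placeholders):

```
# Nightly VACUUM ANALYZE of the RT database at 03:15
15 3 * * *   /usr/local/pgsql/bin/vacuumdb -z rt

# Vacuum a heavily-updated table in another database every 6 hours
0 */6 * * *  /usr/local/pgsql/bin/vacuumdb -z -t busytable otherdb
```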

Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: khera@kciLink.com Rockville, MD +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/

Jim Meyer purp@wildbrain.com writes:

I suspect that it had never been run after setting up the schema, so
the DB was going with no hints. Can anyone confirm or deny this? If
it turns out that this is correct, I’d highly suggest that “VACUUM
ANALYZE” be run immediately after setting up a new PG schema.

No, you need to do periodic ANALYZEs to update the statistics about the
tables. Running it once, while there’s no data in the tables, is no
good; you just get broken statistics.

Please consult the performance tuning documentation for PostgreSQL at
http://postgresql.org/users-lounge/docs/7.2/postgres/performance-tips.html,
which includes a whole section about the planner’s statistics usage and
the care and feeding thereof.

Mike.