On many occasions I have found Tableau data extracts are slow to refresh on the Tableau Server when pulling from SQL (and I’m sure this happens with all data connections). I always used to put this down to the slow execution of the query, which could definitely be a cause as if it’s slow to run on the SQL Server it will be slow to run when refreshing an extract. It also appears that Tableau can be the cause of the slow updating in some cirucmstances. I have just come across this write up on extract performance by one of the Tableau staff which they put on their forums. I think it is very useful so I’ve copied it into this article.
Basically it advises what to check to see what the cause is of the slow updating of extracts and advises what to do if the blame is on the Tableau side.
“When thinking about extracts, the “width” of the final extract (all columns in the extract) is often as, and sometimes more important than the height/number of rows. So keep that in mind for the rest of the “commentary”.
I just pulled ~5M rows from SQL (via a 10-table join) and it took 5 min, 10 sec – If the output isn’t too wide and your database is responsive, ~1M rows / minute is fairly standard.
If you have a particularly “fast” data source, you can do better. For example, when extracting rows from a huge local extract with not too many columns, I was doing about 80M an hour, or ~1.33M / minute. Yay!
Flip this equation – I had a client pulling ~190M rows from a SQL Server via a fairly ugly query – it also returned 100+ columns, some of which contained extremely unique values (which makes for slower sorting). It took these folks about 8 hours to return results from SQL, and then an additional 10+ hours to process/sort because of the super-high cardinality in some of their fields – about 174K rows / min. Boo!
So…I’d say that 11M rows in 2-3 hours is too slow. The things you should look at:
- How long does it take after you start the extract process for SQL Server to start returning records (based on the Tableau dialog that says “X rows retrieved so far”)? This is pure “SQL think time”
- How long does it take to return those 11M rows? This is a combination of SQL Processing time and latency in terms of returning 11M rows across your network
- How long does Tableau take in the “Optimizing” and “Sorting” phases – this is pure Tableau time
If you see the lion’s share of your time being sucked up in bullets 1 & 2, you should see if you can optimize SQL – look at the tables/query you’re using and add appropriate indexes, etc. etc.
If Tableau is taking a long time to optimize/sort, you may have lots of columns and/or columns that contain very unique values which take longer to sort. Consider removing those fields from your data source before you create the extract.”
If the issue is on the SQL side I suggest you get a DBA to optimise your query if you don’t know how to do that yourself. Improved indexing can speed up your query execution speed exponentially. If that fails you could pre-aggregate the data giving Tableau a simple select from a single table.