When you first start learning SQL, you may hear the term ‘ad-hoc query’ again and again. An ad-hoc query is a query that is written on the fly. It is the kind of query you might use from time to time to pull information using your desktop-resident query tool. An ad-hoc query is not predefined in code, and it is not run on a routine basis.

Writing Ad Hoc Queries Takes Skill

There are many occasions when writing ad-hoc queries may be required. You may have a set of queries predefined in code for your app that allow its users to pull out the most commonly requested information. If one of those users suddenly needs a completely different set of information that is beyond the scope of the tool, they may contact the database manager and ask them to find it.
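To make the distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` table, its columns, and the `customers_in_city` function are made up for illustration. The first query is the predefined kind that lives in application code, and the second is the kind a database manager might type once to answer a one-off request.

```python
import sqlite3

# Hypothetical data set: a small `customers` table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, signup_year INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (name, city, signup_year) VALUES (?, ?, ?)",
    [("Ann", "Leeds", 2021), ("Raj", "Leeds", 2023), ("Mia", "York", 2023)],
)


def customers_in_city(conn, city):
    """Predefined query: part of the app's code and run routinely."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]


leeds = customers_in_city(conn, "Leeds")

# Ad-hoc query: written on the fly for a question the app does not cover.
signups_by_year = conn.execute(
    "SELECT signup_year, COUNT(*) FROM customers "
    "GROUP BY signup_year ORDER BY signup_year"
).fetchall()

print(leeds)
print(signups_by_year)
```

The predefined query can be reviewed and tuned once, while the ad-hoc one is only as efficient as whoever wrote it made it.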

Let’s face it, most people who use data today are not technically inclined. They have the power of complex databases at their fingertips, but they do not really know what goes into constructing the queries. Writing ad-hoc queries by hand takes skill.

There are some tools out there that allow people to produce ad-hoc queries without knowing what goes on under the hood. With those tools, end users can construct complex queries using a drag-and-drop interface. It’s hard to beat that simplicity.

Improving Database Performance for a Range of Queries

One of the challenges of ad-hoc queries is that, while predefined queries are usually quite well optimized, an ad-hoc one might be far more resource-heavy. To reduce the impact of the extra resource requirements, it helps if the database server is well-provisioned in terms of storage, memory and even processor time. For some data sets, a lot of the demand can be mitigated using pre-calculated result sets, but that is not always an option.
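As a rough illustration of a pre-calculated result set, the sketch below builds a small summary table once and then answers a question from it instead of from the raw rows. The `sales` and `monthly_sales` tables are hypothetical, and in practice the summary would probably be refreshed by a scheduled job rather than created inline like this.

```python
import sqlite3

# Hypothetical schema: a `sales` table holding individual transactions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, sold_on TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (sold_on, amount) VALUES (?, ?)",
    [("2024-01-05", 10.0), ("2024-01-20", 15.0), ("2024-02-03", 7.5)],
)

# Pre-calculate the monthly totals once. The first seven characters of an
# ISO date (YYYY-MM) identify the month.
conn.execute(
    """
    CREATE TABLE monthly_sales AS
    SELECT substr(sold_on, 1, 7) AS month, SUM(amount) AS total
    FROM sales
    GROUP BY substr(sold_on, 1, 7)
    """
)

# A later question about monthly totals reads the small summary table
# and never touches the full `sales` table.
monthly_totals = conn.execute(
    "SELECT month, total FROM monthly_sales ORDER BY month"
).fetchall()
print(monthly_totals)
```

The trade-off is freshness: the summary is only as current as its last rebuild, which is one reason this is not always an option.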

Another way that database administrators can improve the performance of a database is by discouraging end-users from making extensive use of ad-hoc queries unless they are absolutely necessary. It is often far better to use a query that takes data from a pre-defined range instead of scanning the entire database. This is particularly true where the database in question contains millions of transactions, for example. It’s unlikely that the user really wants to look through all of those. They could sort by date or some other key field and then query only the results that fall within the required range.
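A range-limited query of this kind might look like the sketch below. The `transactions` table, the index name, and the `transactions_between` helper are assumptions for illustration. The index on the date column is what gives the database a chance to avoid reading every row, although whether it is actually used depends on the database engine and its query planner.

```python
import sqlite3

# Hypothetical schema: a `transactions` table with an index on the date column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, created_on TEXT, amount REAL)"
)
conn.execute("CREATE INDEX idx_transactions_created_on ON transactions (created_on)")
conn.executemany(
    "INSERT INTO transactions (created_on, amount) VALUES (?, ?)",
    [
        ("2023-12-31", 5.0),
        ("2024-01-10", 20.0),
        ("2024-01-25", 30.0),
        ("2024-02-01", 40.0),
    ],
)


def transactions_between(conn, start, end):
    """Return rows with start <= created_on < end, ordered by date."""
    return conn.execute(
        "SELECT id, created_on, amount FROM transactions "
        "WHERE created_on >= ? AND created_on < ? "
        "ORDER BY created_on",
        (start, end),
    ).fetchall()


# Only January 2024 is requested, not the whole table.
january = transactions_between(conn, "2024-01-01", "2024-02-01")
print(january)
```

Using a half-open range (start included, end excluded) keeps adjacent ranges from overlapping, so a transaction dated 2024-02-01 belongs to February only.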

Ad-hoc queries can cause significant performance degradation, especially if they are complex. It is a good idea to run them on a ‘copy’ of the live database and to take a data mining approach to the query, rather than running it on the live database, where any slowdown is felt by everyone who depends on it.
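One way to picture the ‘copy’ approach is the sketch below, which uses the backup method on Python's sqlite3 connections to snapshot a database and then runs the heavy query against the snapshot. The `orders` table is hypothetical, and SQLite is only a stand-in here: a larger database server would more likely rely on its own replication or export tooling.

```python
import sqlite3

# Stand-in for the live database, with a hypothetical `orders` table.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
live.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 12.0), ("bob", 30.0), ("alice", 8.0)],
)
live.commit()

# Copy the live data into a separate reporting database.
reporting = sqlite3.connect(":memory:")
live.backup(reporting)

# The ad-hoc query runs against the copy, so the live database is not
# slowed down by it.
per_customer = reporting.execute(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(per_customer)
```

The copy reflects the data as of the moment it was taken, so it suits analysis and reporting better than questions that need up-to-the-second answers.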

With proper design and scheduling, it is possible to give users access to the information that they need without having to worry too much about the database taking a performance hit.