PSYCHED about DQL Next

DQL Next is under design. What we have provided in Domino 10 and 11 is basically a find engine that uses all available means within Domino of searching document contents. We’re not done there, to be sure. But what follows is a vast set of capabilities that have never been bundled the way we are going to bundle them. I’m excited about it, because it will make Domino developers very powerful and efficient.

What I wanted to offer in this post was the set of principles and feature areas we’re advancing. It’s all editable; we may add new things and throw away some of them. I wanted to lay them out nonetheless, so people could see and contribute.

Domino is 36 years old this year. Its survival and continued relevance are due to some good decisions made early on in its evolution. An easy criticism of Domino is that all this is 30 years too late. At HCL, we don’t believe that. We want to keep a good thing going.

Syntax vs. Structure

One of the earliest discussions we had about DQL (it wasn’t even called DQL) was a debate between structured fields fed to an API (see MongoDB) and query syntax. We decided on the latter because we reasoned it would be easier to write and read queries, and easier to figure out what they should do or were doing. It was simpler.

So, we built the syntax that we will very soon ship with 11.0.1. Most people have been happy with it.

With the advent of multiple database queries, returning and sorting output, combining query results using joins, named queries, saved queries and all the rest, the simplicity argument starts being an uphill one. Yet, people have used SQL for decades and its syntax could hardly be described as easy to understand.

And that comparison, to SQL, has also offered strong impetus to stay far away from relational constructs, all the while conceding the functional need for them. Domino is a proud NoSQL engine and should therefore have a NoSQL query language. Also, if we were to make syntax that duplicates SQL, people would expect SQL support in Domino even in the small places like GROUP BY and JOIN.

There isn’t time for a critique of SQL and it doesn’t matter anyway. SQL owns the DML and query space in relational database technology. Domino will never be relational, and our users are glad of that (except when they want reports in Excel, say).

The principles of DQL syntax are these. We will:

bundle processing in syntax and avoid calls with parameters
accomplish all processing using keyword phrases rather than punctuation when possible
largely avoid SQL in favor of Domino-centric and Domino-friendly syntax
leverage and empower Domino’s data model and design center in DQL syntax

Multiple databases and named queries

DQL will support data from multiple databases. This is not only for COMBINE processing (described below) but also for federated querying. It’s very common for sets of databases to be built and populated using the same or very similar design, where data is kept separate for any number of reasons. There will be a way, such as a new IN clause variant, to process the same query, including RETURNed data, against multiple databases.

When this mechanism is used to COMBINE data, its portion of the query (query within the query or sub-query) will need to be named in order to be specific about which fields, view columns and Formula Language widgets are being specified in RETURNS, SORTBY and COMBINE clauses. This will likely be done using a keyword like “AS” preceding or following the new IN clause variant.
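To make the federated idea concrete, here is a minimal Python sketch. It is not DQL syntax and not a Domino API: databases are modeled as plain lists of dictionaries, and the names `federated_query` and `_source` are hypothetical, with `_source` standing in for whatever an AS-style name ends up being. It only shows the shape of the result: one query, several similarly designed databases, and rows that stay attributable to the database they came from.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]


def federated_query(
    databases: Dict[str, List[Document]],
    predicate: Callable[[Document], bool],
    returns: List[str],
) -> List[Document]:
    """Run one predicate against every database and tag each row with its source."""
    rows: List[Document] = []
    for name, docs in databases.items():
        for doc in docs:
            if predicate(doc):
                # Keep only the requested fields, like a list of returned values.
                row = {field: doc.get(field) for field in returns}
                # Hypothetical label so later clauses could tell the sources apart.
                row["_source"] = name
                rows.append(row)
    return rows


# Two databases built from the same design, kept separate by region (made-up data).
databases = {
    "orders_emea": [{"order_no": 1, "total": 250}, {"order_no": 2, "total": 40}],
    "orders_apac": [{"order_no": 7, "total": 900}],
}

results = federated_query(
    databases,
    predicate=lambda doc: doc["total"] > 100,
    returns=["order_no", "total"],
)
```

The source label is the part that matters once results are combined: without it, an `order_no` from one database cannot be told apart from the same field in another.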

Returning data

Right now, DQL finds documents. Internally, it returns an IDTable, a well-described and fine-performing data structure that is used ubiquitously in core code. It supports one data order – ascending (or descending) by NoteID. And NoteIDs are database-instance relative only. We need to do a lot more.

Our design is to supply a RETURNS clause in DQL that takes a comma-separated list of fieldnames, view columns and Formula Language widgets following the keyword. By design, the RETURNS clause would be issued at the end of a query, where its list of return values would reference earlier query parts. And RETURNS lists would be independent of any sorting and results combining, though categorized ordering and aggregation would dictate the returned set of values.

The formats we are targeting are JSON, comma-separated values and NSF Summary (later).

There will also be a DOCUMENTS keyword, say, which means NoteID || UNID combinations only.
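As a rough illustration of what JSON and comma-separated output of a returned field list could look like, here is a small Python sketch. It assumes the found documents are already available as dictionaries; the `render` function and the sample fields are hypothetical and say nothing about how Domino will actually produce these formats.

```python
import csv
import io
import json
from typing import Dict, List


def render(rows: List[Dict[str, object]], fields: List[str], fmt: str) -> str:
    """Render only the requested fields of each row as JSON or CSV text."""
    if fmt == "json":
        return json.dumps([{f: row.get(f) for f in fields} for row in rows])
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=fields,
            extrasaction="ignore",  # silently drop fields that were not requested
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Made-up documents; note_id is present but not requested, so it is not returned.
rows = [
    {"order_no": 1, "customer": "Acme", "note_id": "0x8FA"},
    {"order_no": 2, "customer": "Globex", "note_id": "0x8FE"},
]

as_json = render(rows, ["order_no", "customer"], "json")
as_csv = render(rows, ["order_no", "customer"], "csv")
```

The point of the sketch is that the field list drives the output regardless of format, which matches the idea that the list of returned values is independent of how results are serialized.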

Sorting data

Sorted results are the most frequent request we have of DQL today. We will provide them via a list of fieldnames, view columns and Formula Language widgets, much like the RETURNS clause arguments. The keyword will likely be SORTBY, to avoid ORDER BY and all its mandated processing. We will offer DESCENDING and CATEGORIZED for each sorted field in addition to the default ASCENDING.
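A per-field sort direction is easy to picture with a short Python sketch. This is only a model of the behavior, under the assumption that rows are dictionaries and that each sort key is a (field, descending) pair; `sort_by` is a hypothetical name and CATEGORIZED is not modeled. It relies on Python's stable sort, applying the least significant key first.

```python
from typing import Dict, List, Tuple


def sort_by(
    rows: List[Dict[str, object]],
    keys: List[Tuple[str, bool]],
) -> List[Dict[str, object]]:
    """Sort rows by several fields; each key is (field, descending), most significant first."""
    result = list(rows)  # leave the caller's list untouched
    # A stable sort lets us apply keys from least to most significant.
    for field, descending in reversed(keys):
        result.sort(key=lambda row: row[field], reverse=descending)
    return result


# Made-up rows: ascending by region, then descending by total within a region.
rows = [
    {"region": "EMEA", "total": 40},
    {"region": "APAC", "total": 900},
    {"region": "EMEA", "total": 250},
]

ordered = sort_by(rows, [("region", False), ("total", True)])
```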

Combining data

Joins have been emphasized as missing Domino functionality so long that join processing could be thought of as the only way to combine results across multiple databases. Of COURSE we have to support fielda = fieldb joining, but Domino is a document store NoSQL database, so the overall operation will be called a COMBINE. In addition to joins, we intend to support

Federated results – the same set of RETURNed fields from multiple databases
CONTAINS joins – look up scalar values from one field using FTSearch in another field or in an entire set of documents
Links – documents link to one another. Leverage those links in COMBINEd results

INNER and OUTER joins are functionality for which we still need to gauge the demand.
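For readers who want to picture the fielda = fieldb case, here is a minimal Python sketch of an equality join over two sets of documents. Everything in it is an assumption made for illustration: documents are dictionaries, `combine` is a hypothetical function, and the `outer` flag simply keeps unmatched left-hand documents. It is not a statement about how COMBINE will be implemented.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Document = Dict[str, object]


def combine(
    left: List[Document],
    right: List[Document],
    left_field: str,
    right_field: str,
    outer: bool = False,
) -> List[Dict[str, Optional[Document]]]:
    """Pair documents where left[left_field] == right[right_field]."""
    # Index the right-hand documents by their join value.
    index = defaultdict(list)
    for doc in right:
        index[doc.get(right_field)].append(doc)

    combined: List[Dict[str, Optional[Document]]] = []
    for doc in left:
        matches = index.get(doc.get(left_field), [])
        for match in matches:
            combined.append({"left": doc, "right": match})
        if outer and not matches:
            # Keep the unmatched left-hand document with no partner.
            combined.append({"left": doc, "right": None})
    return combined


# Made-up data: customers in one database, orders in another.
customers = [{"cust_id": "C1", "name": "Acme"}, {"cust_id": "C2", "name": "Globex"}]
orders = [{"order_no": 1, "customer": "C1"}, {"order_no": 2, "customer": "C1"}]

inner_rows = combine(customers, orders, "cust_id", "customer")
outer_rows = combine(customers, orders, "cust_id", "customer", outer=True)
```

In this model the only difference between the two variants is whether a customer with no orders still appears in the output, which is the behavior whose demand has to be gauged.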

Saved results

For long-running queries and to provide efficient shorthand in DQL syntax, saved results will be offered. Runtime processing will be extremely fast. They will initially be named IDTables residing in the databases specified, replicable and refreshable. Later they will be composite views from multiple databases. These results consume disk resources and will be thought of as transient, though you may want to keep some around. Details on their management are TBD.

Domino services

Most of the required and designed DQL functionality has long been part of Domino core. I would put the proportion of that capability that is truly published (via APIs) in the 60-70% range, so we’re increasing that dramatically here.

I am doing my job when I make Domino application developers powerful. With power comes responsibility, which will be described in volumes of small print and controls to come.

Now .. more goodies

I feel a bit like a radio disk jockey here, but I want to offer a sneak preview of more details of this design to a set of folks – stakeholders and business partners. In that spirit, I will send invitations to the first n people who ask for inclusion via Twitter message or e-mail (yes some have an advantage there). Please include YOUR e-mail address.

We need your feedback to get this right. It’s extremely exciting, but you know the details of your work with Domino, the contours of your business, operational considerations, costs and expertise. So please be active in Aha, in forums and in responses to blogs like this one.
