Testing Oracle's Select AI Capabilities

Oracle Cloud Infrastructure (OCI) provides a robust platform for deploying and managing applications. In this blog, we test the capabilities of Select AI with a sample application using Oracle's HR schema.

I've seen several presentations on how Select AI works at various APEX conferences. However, they were very simple, mostly formulating queries on a table (in English) and getting answers (in SQL) using some kind of language model.

Basically, I am curious about three things:

  1. How Select AI handles complex queries?

  2. How different language models (Cohere, OpenAI) deal with complex queries?

  3. How to integrate Select AI with an on premise application and what results are obtained with metadata and queries written in Hungarian?

In this article I will explore the first point using OpenAI's language model. For now, Select AI only works within OCI, so I have created an app with the necessary settings where I can ask questions.

In the schema, I populated the HR schema tables using APEX Sample Datasets.

The HR schema data model looks like this:

Start testing Select AI

Let's start with a simple question:

"Show me the details of the employee with the highest salary."

This solution may not be the most beautiful, so you can ask it to use an analytic function, for example.

"Show me the details of the employee with the highest salary. Use an analytic function in the solution."

SELECT e.employee_id, e.first_name, e.last_name, e.salary
    FROM (
    SELECT employee_id, first_name, last_name, salary,
    RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM employees) e
WHERE e.salary_rank = 1

Since I would like to look at the table connections, let's ask what the salaries look like by departments:

"List the amount of salary per department!"

SELECT d.DEPARTMENT_NAME, SUM(e.SALARY) AS TOTAL_SALARY
FROM ENROLOCI.DEPARTMENTS d
    JOIN ENROLOCI.EMPLOYEES e ON d.DEPARTMENT_ID = e.DEPARTMENT_ID
GROUP BY d.DEPARTMENT_NAME

Let's go through the data model up to the highest aggregation, the regions.

"List the amount of salary per location!"

Error.

"List the amount of salary per department and location!"

SELECT d.DEPARTMENT_NAME AS "Department", l.CITY AS "Location", 
        SUM(e.SALARY) AS "Total Salary"
FROM ENROLOCI.DEPARTMENTS d
    JOIN ENROLOCI.LOCATIONS l ON d.LOCATION_ID = l.LOCATION_ID
    JOIN ENROLOCI.EMPLOYEES e ON d.DEPARTMENT_ID = e.DEPARTMENT_ID
GROUP BY d.DEPARTMENT_NAME, l.CITY

After referring in the question to the table between location and salary, the department, we managed to get a good answer to the question.

"List the salaries by country!"

Error.

"List the salaries by country! Use foreign keys!"

Error.

(This model's maximum context length is 4097 tokens. However, you requested 4237 tokens (3213 in the messages, 1024 in the completion). Please reduce the length of the messages or completion.)

Let's try another way.

List the sum of employees' salary by country! Use foreign keys!

SELECT COUNTRY.COUNTRY_NAME, SUM(EMPLOYEES.SALARY) AS TOTAL_SALARY
FROM ENROLOCI.COUNTRIES COUNTRY
    JOIN ENROLOCI.LOCATIONS LOCATION ON COUNTRY.COUNTRY_ID = LOCATION.COUNTRY_ID
    JOIN ENROLOCI.DEPARTMENTS DEPARTMENT ON LOCATION.LOCATION_ID = DEPARTMENT.LOCATION_ID
    JOIN ENROLOCI.EMPLOYEES EMPLOYEES ON DEPARTMENT.DEPARTMENT_ID = EMPLOYEES.DEPARTMENT_ID
GROUP BY COUNTRY.COUNTRY_NAME

Finally, there was no need to list the tables between salary and country. However, I had to specify that the salary belonged to the workers' table and instructed them to use the foreign keys.

"List the sum of employees' salaries by region!"

Interestingly, I didn't have to ask to use foreign keys here, I got the correct result anyway. I assume the AI has learned from your mistakes.

Conclusion

To be fair, the HR schema tables are very well populated with comments.

Let's look at the employees and departments tables for example:

comment on table EMPLOYEES
  is 'employees table. Contains 107 rows. References with departments, 
jobs, job_history tables. Contains a self reference.';
comment on table DEPARTMENTS
  is 'Departments table that shows details of departments where 
      employees work. Contains 27 rows references with locations, 
      employees, and job_history tables.';

The comment in the salary field is as follows:

comment on column EMPLOYEES.salary
  is 'Monthly salary of the employee. Must be greater than zero 
      (enforced by constraint oehr_emp_salary_min)';

Of course, the number of questions is endless, with countless possibilities for testing. However, I think Select AI is a good solution to use, although the metadata descriptions need a lot of testing and tuning.

At the same time, I see Select AI as very promising in its ability to evolve.