Author: Rohan Prabhakar & Vipul Tripathi
The lifecycle of a data or AI project doesn't end once a Flow is finished. To keep our workflows current and improve our models, we must continually feed them new data. Dataiku lets us do this more efficiently by reducing the amount of manual oversight our data requires.
However, as we automate workflows, we expose ourselves to certain risks, such as unknowingly ingesting low-quality data, which could affect datasets, models, and dashboards.
For example, our workflow could break when an extra column is added to an input dataset, or our model could stop capturing the pattern in the data and become outdated.
While automation promises to save time, it also creates the need to implement key metrics and checks, so that our model doesn't break and stays relevant.
Metrics are metadata used to take measurements on the following Flow items:
datasets,
managed folders, and
saved models.
They allow us to monitor the evolution of a Dataiku DSS object. For instance, we could compute:
the number of missing values in the dataset,
the size of a folder, or
the precision of a model.
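Outside DSS, the same three measurements boil down to simple computations. The sketch below illustrates them in plain Python with hypothetical data and helper names (this is not Dataiku's implementation, just the underlying idea):

```python
import os
import tempfile

# Metric 1: number of missing values in a column
# (a plain list stands in for a dataset column here).
ages = [34, None, 52, None, 41]
missing_values = sum(1 for v in ages if v is None)

# Metric 2: size of a folder, as the total bytes of all files under it.
def folder_size(path):
    return sum(
        os.path.getsize(os.path.join(root, name))
        for root, _, names in os.walk(path)
        for name in names
    )

# Metric 3: accuracy of a model, as the fraction of correct predictions.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(missing_values)                        # 2
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```

In DSS these values are computed and stored automatically on each build, so their evolution can be charted over time.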
Metrics can also be set on partitioned objects and be computed on a per-partition basis.
Metrics are often used in combination with checks to verify their evolution.
Checks return one of the four following statuses after each run:
EMPTY if the metric's value hasn't been computed;
ERROR if the check condition has not been respected;
WARNING if the check fails a "soft" condition, but not a "hard" one;
OK if the check doesn't raise any concern.
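The four outcomes above can be modeled as a small function. This is only an illustrative sketch of the semantics, not DSS internals: a check evaluates a metric value against a "hard" condition (whose failure is an error) and an optional "soft" condition (whose failure is only a warning).

```python
def check_status(value, hard_ok, soft_ok=None):
    """Map a metric value to one of the four check outcomes.

    value   -- the computed metric value, or None if never computed
    hard_ok -- predicate for the "hard" condition (failure -> ERROR)
    soft_ok -- optional predicate for the "soft" condition (failure -> WARNING)
    """
    if value is None:
        return "EMPTY"    # the metric's value hasn't been computed
    if not hard_ok(value):
        return "ERROR"    # the hard condition is violated
    if soft_ok is not None and not soft_ok(value):
        return "WARNING"  # only the soft condition is violated
    return "OK"           # the check raises no concern

# Example: a record count must stay above 0 (hard) and above 100 (soft).
print(check_status(None, lambda n: n > 0, lambda n: n > 100))  # EMPTY
print(check_status(0,    lambda n: n > 0, lambda n: n > 100))  # ERROR
print(check_status(50,   lambda n: n > 0, lambda n: n > 100))  # WARNING
print(check_status(500,  lambda n: n > 0, lambda n: n > 100))  # OK
```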
Model Metrics And Checks
We can build metrics and checks for more Dataiku DSS objects than just datasets. Models are another object that needs close monitoring.
The green diamond in the Flow represents the deployed model that predicts whether a credit card transaction will be authorized or not.
From the Actions panel, select it and choose to retrain it (non-recursively). Opening the deployed model, we can view the current and previous versions. An important performance metric for classification models is the ROC AUC (or simply AUC), which here is typically around 0.75.
This metric can be tracked, along with many other widely used measures of model performance.
Go to the deployed model object's Metrics & Status tab.
On the View subtab, click Display to show the various built-in model metrics.
Add AUC to the displayed metrics list and save. Additional performance indicators, such as recall, accuracy, and precision, are also available.
Let's now build a check to monitor this metric.
Navigate to the Settings tab.
Under the Status checks subtab, add a new check for Metric value in a numeric range.
Name it 0.6 <= AUC <= 0.95.
Choose AUC as the metric to check.
To throw an error if the model performance has either declined or increased suspiciously, set a minimum value of 0.6 and a maximum value of 0.95.
Set a soft minimum and soft maximum of 0.65 and 0.9, respectively, to warn us when the performance of the deployed model has improved or worsened.
Run the check to confirm that it works, then save it.
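The numeric-range check configured above combines a hard range (0.6 to 0.95, error) with a soft range (0.65 to 0.9, warning). As a self-contained sketch, using the thresholds from the steps above (the function name and structure are illustrative, not how DSS implements it):

```python
def auc_check(auc, hard_min=0.6, soft_min=0.65, soft_max=0.9, hard_max=0.95):
    """Evaluate an AUC value against hard (error) and soft (warning) bounds."""
    if auc is None:
        return "EMPTY"
    if auc < hard_min or auc > hard_max:
        return "ERROR"    # performance declined, or improved suspiciously
    if auc < soft_min or auc > soft_max:
        return "WARNING"  # drifting toward a hard bound
    return "OK"

print(auc_check(0.75))  # OK -- the typical value for this model
print(auc_check(0.62))  # WARNING
print(auc_check(0.99))  # ERROR
```

Note that an AUC that is "too good" also raises an error: a sudden jump often signals a data problem such as target leakage, not a genuinely better model.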
Insert A Trigger
In Dataiku DSS, we can design a variety of triggers to activate a scenario. Let's start with the simplest kind of trigger.
Within the Triggers panel of the Settings tab, click the Add Trigger dropdown button.
Add a Time-based trigger.
Instead of the default “Time-based”, name it Every 3 min.
Change “Repeat every” to 3 minutes.
Make sure its activity status is marked as ON.
Now Dataiku will run the scenario every 3 minutes, computing the metrics and checks that monitor data quality and model performance.