How to Use GPT-4V for Stoplight Reports
Efficiently create a stoplight report from data dashboard images
Introduction
Comparing data dashboards is crucial for understanding trends and performance differences. Traditionally, this task required manual effort, which was slow and sometimes inaccurate. With OpenAI’s GPT-4 with Vision (GPT-4V), we can now automate and improve this process. This article shows how to use GPT-4V to compare two data dashboards quickly and accurately.
Traditionally, comparing two data dashboards and generating a stoplight report that highlights the differences between them requires the machine to recognise and compare text within the images. This involves at least three steps:
- Text Extraction: First, the text displayed on both dashboards must be extracted from the images, typically with Optical Character Recognition (OCR).
- Data Extraction (Feature Selection): The next step is to identify the key metrics and features presented in both dashboards, focusing on the essential data for comparison.
- Comparison and Stoplight Report Generation: With the data extracted from the two dashboards, the differences can be compared and a stoplight report generated from the findings.
However, customising the code at each step is often necessary. This is particularly true for comparing charts, graphs, or other visual elements within the dashboards, which requires an understanding of the visualisation types used (e.g., bar charts, line graphs, heatmaps) and the data they represent.
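For a sense of what the traditional first step looks like, here is a minimal OCR sketch; it assumes the pytesseract and Pillow libraries are installed, and the file names are placeholders:

from PIL import Image
import pytesseract

# Minimal sketch of step 1 of the traditional pipeline:
# pull raw text out of a dashboard screenshot with OCR.
def extract_dashboard_text(image_path):
    return pytesseract.image_to_string(Image.open(image_path))

# Placeholder file names for two dashboard snapshots
old_text = extract_dashboard_text("dashboard_old.png")
new_text = extract_dashboard_text("dashboard_new.png")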
ChatGPT makes life easier. Developed by OpenAI, ChatGPT can handle a wide range of natural language processing (NLP) tasks. In September 2023, OpenAI released the multimodal model GPT-4 with Vision (GPT-4V), which marries the language-processing prowess of GPT-4 with image-analysis capabilities, offering a comprehensive multimodal AI experience. With GPT-4V, comparing data dashboards is greatly streamlined: by crafting an appropriate prompt and supplying the dashboard images, the model can analyse the data within the images and return a detailed stoplight report, making the comparison more efficient and user-friendly.
What is GPT-4V, and what is a stoplight report?
GPT-4 with Vision enables users to instruct GPT-4 (a large multimodal model) to analyse image inputs alongside text prompts. With the ability to interpret and analyse data presented in visual formats, GPT-4V can monitor performance, identify trends and patterns, and enable real-time analysis.
A stoplight report uses the colours of a traffic light (red, yellow, green) to indicate the status or importance of the indicators or metrics. The colours represent:
- Green: Indicates that the metric or indicator is on track, performing well, or meeting its target. It signals that no immediate action is required.
- Yellow: Suggests caution. The metric is somewhat off target, or there may be potential issues that need attention. It serves as a warning that while not critical, some action may be needed to ensure the target is met or to improve performance.
- Red: Signifies that the metric or indicator is off target, performing poorly, or in a critical state. It alerts users that immediate action is necessary to address issues or to get back on track.
Example: ARC-ROR Data Dashboard
The following screenshots are from the same seafood sales dashboard on two different dates; each presents a snapshot of the sales situation. We will use these two screenshots as examples to explain how to use an LLM to generate stoplight reports for dashboards.
Step 0: OpenAI API Preparation
We need an OpenAI API key to access GPT-4V.
import base64
import requests
import json
import configparser
import os

# Read the OpenAI API key from a local config.ini file
current_path = os.path.dirname(os.path.realpath(__file__))
config = configparser.ConfigParser()
config.read(os.path.join(current_path, 'config.ini'))
api_key = config['DEFAULT']['api_key']
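Alternatively (not used in the rest of this article), the key can be read from an environment variable instead of a config file:

# Alternative: read the API key from the OPENAI_API_KEY environment variable
api_key = os.environ["OPENAI_API_KEY"]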
Step 1: Define the stoplights
We define the rules for each indicator. Note that the order of the sub-rules matters: GPT considers them from top to bottom. For example, once the first Revenue rule (“If there is a reduction of more than 5% in Revenue compared to the old value, the stoplight should be red”) is satisfied, GPT turns the stoplight for this indicator red and moves on to the next indicator. A deterministic version of these rules, useful for sanity-checking the model’s answers, is sketched after the list.
- Revenue:
1. If there is a reduction of more than 5% in Revenue (compared to the old value), the stoplight should be red
2. If there is a reduction of more than 1% and under 5% in Revenue (compared to the old value), the stoplight should be yellow
3. Otherwise, the stoplight is green
- Weight:
1. If there is a change of more than 10% in Weight (compared to the old value), the stoplight should be yellow
2. Otherwise, the stoplight is green
- Orders:
1. If there is a reduction of 10% or more in Orders (compared to the old value), the stoplight should be yellow
2. Otherwise, the stoplight is green
- Profit margin:
1. If there is a reduction of more than 5% in Margin (compared to the old value), the stoplight should be red
2. If there is a reduction of more than 2% and under 5% in Margin (compared to the old value), the stoplight should be yellow
3. Otherwise, the stoplight is green
- Inventory:
1. If there is a change of 10% or more in Inventory (compared to the old value), the stoplight should be red
2. If there is a change of more than 5% and under 10% in Inventory (compared to the old value), the stoplight should be yellow
3. Otherwise, the stoplight is green
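These rules will be folded into the system prompt in Step 2. Because they are plain thresholds, we can also encode them deterministically; the helper below is our own sanity-checking sketch (not part of the API call), useful for verifying the colours GPT assigns:

# Deterministic version of the stoplight rules above, evaluated top to bottom.
# `old` and `new` are the numeric metric values from the two dashboards.
def expected_stoplight(indicator, old, new):
    pct = (new - old) / old * 100  # signed percentage change
    reduction = -pct               # positive when the metric dropped
    if indicator == "Revenue":
        return "red" if reduction > 5 else "yellow" if reduction > 1 else "green"
    if indicator == "Weight":
        return "yellow" if abs(pct) > 10 else "green"
    if indicator == "Orders":
        return "yellow" if reduction >= 10 else "green"
    if indicator == "Profit margin":
        return "red" if reduction > 5 else "yellow" if reduction > 2 else "green"
    if indicator == "Inventory":
        return "red" if abs(pct) >= 10 else "yellow" if abs(pct) > 5 else "green"
    raise ValueError(f"Unknown indicator: {indicator}")

# Example: a drop from 13.13 to 12.11 is a ~7.8% reduction, so Revenue turns red
print(expected_stoplight("Revenue", 13.13, 12.11))  # -> red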
Step 2: Design the response template
We create a response template that fixes the format and content range of GPT-4V’s response. Together with the Step 1 rules, it will form the system message (see the sketch after the template).
{
    "Revenue": {
        "First image": "12.16%",
        "Second image": "12%",
        "Stoplight": "green"
    },
    "Weight": {
        "First image": "0.53M",
        "Second image": "1.0M",
        "Stoplight": "red"
    },
    "Order": {
        "First image": "982",
        "Second image": "900",
        "Stoplight": "yellow"
    },
    "Profit margin": {
        "First image": "13.4%",
        "Second image": "12.2%",
        "Stoplight": "green"
    },
    "Inventory": {
        "First image": "800.31K",
        "Second image": "10.31K",
        "Stoplight": "red"
    }
}
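How the rules and template become the system_message used later is up to you; here is one minimal way to assemble it, where rules_text and response_template stand for the Step 1 rules and the JSON template above, pasted in as strings:

# Assemble the system message from the Step 1 rules and the response template.
# rules_text and response_template hold the texts shown above, pasted verbatim.
rules_text = """..."""          # the stoplight rules from Step 1
response_template = """..."""   # the JSON template above

system_message = (
    "You compare two data dashboard screenshots and produce a stoplight report. "
    "Apply the following rules in order for each indicator:\n"
    + rules_text
    + "\nRespond with JSON in exactly this format:\n"
    + response_template
)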
Step 3: Encode the images
Base64 encoding lets image data be safely embedded in text-based API requests, which is especially important for text-based data exchange formats like JSON. Sending the image data directly also avoids the security and privacy risks associated with image hosting and URL access permissions.
def encode_image(image_path):
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode('utf-8')
# Path to the images
image_path = "./NL0008/nl0008-1.1.0.png"
image_path_2 = "./NL0008/nl0008-1.0.8.png"
# Getting the base64 string
base64_image = encode_image(image_path)
base64_image_2 = encode_image(image_path_2)
Step 4: Call GPT-4V
We pass the dashboard images, the system prompt (which includes the response template and the stoplight definitions), and a user prompt to OpenAI’s gpt-4-vision-preview model. See the OpenAI Vision guide in the references below to learn more about gpt-4-vision-preview.
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}"
}
payload = {
"model": "gpt-4-vision-preview",
"messages": [
{
"role": "system", "content": system_message,
},
{
"role": "user",
"content": [
# user prompt
{
"type": "text",
"text": "Can you help me compare the two images and tell me what's different?"
},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}"
}
},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image_2}"
}
}
]
}
],
"max_tokens": 500,
}
response = requests.post("https://api.openai.com/v1/chat/completions",
headers=headers, json=payload)
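The stoplight report comes back as the assistant message in the standard Chat Completions response body, so it can be pulled out like this:

# Extract the model's stoplight report from the Chat Completions response
result = response.json()
report_json = result["choices"][0]["message"]["content"]
print(report_json)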
Output
The OpenAI API returns a response that follows the template, with the stoplight colours assigned according to the rules, so we can clearly see which metrics matter and which have changed significantly.
{
"Revenue": {
"First image": "$13.13M",
"Second image": "$12.11M",
"Stoplight": "Red"
},
"Weight": {
"First image": "0.59M",
"Second image": "0.55M",
"Stoplight": "green"
},
"Order": {
"First image": "1,054",
"Second image": "978",
"Stoplight": "Yellow"
},
"Profit margin": {
"First image": "13.75%",
"Second image": "13.99%",
"Stoplight": "green"
},
"Inventory": {
"First image": "961.20K",
"Second image": "875.46K",
"Stoplight": "Red"
}
}
We also convert data from JSON format to CSV format for enhanced visualisation.
import csv
import json

# Load the stoplight report saved from the API response
with open('data.json', 'r', encoding='utf-8') as json_file:
    data = json.load(json_file)
# The content may be double-encoded (a JSON string inside JSON)
if isinstance(data, str):
    data = json.loads(data)

# Write one CSV row per indicator, with a header row for readability
with open('structured_data.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Indicator', 'First image', 'Second image', 'Stoplight'])
    for category, details in data.items():
        writer.writerow([
            category,
            details['First image'],
            details['Second image'],
            details['Stoplight']
        ])
Conclusion
GPT-4 with Vision excels at interpreting and analysing data displayed in visual formats such as graphs, charts, and dashboards. It is highly effective at tracking and identifying differences between two data dashboards, providing valuable insight into changes and trends.
References
- OpenAI Documentation — Vision API — Learn how to use GPT-4 to understand images. https://platform.openai.com/docs/guides/vision
- How to Use GPT-4 Vision API. 9 Nov. 2023, gptpluginz.com/gpt-4-vision-api/. Accessed 17 Mar. 2024.