Analyze browsing behavior with OpenAI
Google Analytics 4 is impressive: it introduced powerful AI-driven features such as predictive metrics and predictive audiences. It collects a huge amount of event data, and you can get a report on almost anything.
These are all great for making better marketing decisions. From a marketing perspective, however, what would be really useful is to see the site through visitors' eyes: to put ourselves in their shoes and understand what they see on the site, what draws their attention, what they click, and where they wonder whether to click or not.
We decided to explore how existing AI engines can help with that by creating an AI-based extension for Magento 2 that not only generates content but also collects browsing behavior and builds datasets that can be analyzed with the help of AI. As with AI in general, it is in its early stages and things are changing quickly, but we got some really exciting results and see huge potential from a marketing perspective.
In this article I have summarized a few key points and explained some concepts used for collecting and building datasets suitable for AI analysis.
Measuring user behavior
User behavior can usually be measured by collecting a few important metrics and events. Browser interactions are mainly mouse events such as mouseover, mouseleave, click, scroll and others. Collecting these into a dataset that can be analyzed is done with a client-side JavaScript implementation that adds a few event listeners to the important elements on the page, or to all of them. These elements can be called areas of interest.
It is of great importance to identify elements in a unique way, so that events are measured correctly and attributed to the correct element. It is also important that this data is persistent across interactions.
Since not all HTML elements contain a unique identifier, for the purpose of classifying them we can build a hash code based on the number of attributes, the tag name, data-* attributes and more. This hash can then be assigned as a data-hash attribute and used later to identify the element in a unique way.
We shall focus on two important elements: A (anchors) and IMG (images).
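To make the idea concrete, here is a minimal sketch of such a hash. The function name elementHash, the choice of FNV-1a as the hashing algorithm, and the inclusion of href and src in the signature are all assumptions made for illustration; they are not taken from the extension itself.

```javascript
// Hypothetical helper (not the extension's actual code): builds a hash from
// the tag name, the number of attributes, and the data-* attributes.
// `el` only needs `tagName` and an array-like `attributes` of {name, value}
// pairs, which is what a DOM Element provides.
function elementHash(el) {
  // Ignore data-hash itself so the result stays stable after it is assigned.
  const attrs = Array.from(el.attributes).filter((a) => a.name !== 'data-hash');

  // Assumption: href and src are included alongside data-* attributes,
  // otherwise most plain links and images would share the same signature.
  const isKey = (name) => name.startsWith('data-') || name === 'href' || name === 'src';
  const parts = attrs
    .filter((a) => isKey(a.name))
    .map((a) => `${a.name}=${a.value}`)
    .sort(); // sorting makes the hash independent of attribute order

  const signature = [el.tagName.toLowerCase(), attrs.length, ...parts].join('|');

  // FNV-1a, 32-bit (an assumed choice; any stable string hash would do).
  let hash = 0x811c9dc5;
  for (let i = 0; i < signature.length; i++) {
    hash ^= signature.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}
```

Two elements with identical tag, attribute count and key attributes would still collide in this sketch, so a real implementation would likely need to mix in more information, such as the element's position in the page.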
The 'listen' method is applied to every A and IMG element and does 2 important things:
a) Calculates a hash string for the element and adds a data-hash attribute to that element
b) Adds event listeners to that element with all the callback functions. Each callback function tracks a particular event such as mouseover, click etc. and stores this data in a JSON map object, which in turn is stored in the browser's localStorage
The method is designed to listen for and record click, mouseover and mouseout events for all A and IMG elements on the site. This way, it can collect useful information on which elements were focused and clicked, and which were focused but not clicked. This produced a simple dataset like this:
All collected data is stored in a separate table in the database. This allows AI analysis to be done at a later stage.
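The two steps above can be sketched roughly as follows. This is an illustration under assumptions, not the extension's source: the storage key behaviorMap, the shape of each map entry, and the computeHash and storage parameters are hypothetical. Passing the storage in as a parameter simply makes the function easy to try outside a browser.

```javascript
// Assumed key under which the JSON map is kept in storage.
const STORAGE_KEY = 'behaviorMap';

// Hypothetical 'listen' sketch.
//   el          - an A or IMG element (needs setAttribute and addEventListener)
//   computeHash - function returning a hash string for the element
//   storage     - object with getItem/setItem, e.g. window.localStorage
function listen(el, computeHash, storage) {
  // a) Calculate the hash and attach it as a data-hash attribute.
  const hash = computeHash(el);
  el.setAttribute('data-hash', hash);

  // b) Register one callback per tracked event type.
  ['mouseover', 'mouseout', 'click'].forEach((type) => {
    el.addEventListener(type, () => {
      // Read the current map, bump the counter, and write it back.
      const map = JSON.parse(storage.getItem(STORAGE_KEY) || '{}');
      const entry = map[hash] || { tag: el.tagName, mouseover: 0, mouseout: 0, click: 0 };
      entry[type] += 1;
      map[hash] = entry;
      storage.setItem(STORAGE_KEY, JSON.stringify(map));
    });
  });
}

// In a browser this could be wired up like so (not executed here):
// document.querySelectorAll('a, img')
//   .forEach((el) => listen(el, computeHash, window.localStorage));
```

Counting events per element keeps the stored map small; recording a timestamp for every single event would give richer data at the cost of a larger payload.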
AI-Driven analysis
With enough data, it's now possible for AI to analyze it. To facilitate this, we've implemented an Assistant API that can accept dataset files generated by the extension and perform further analysis. Initially, we considered uploading a JSON file directly, but this proved to be less cost-effective. Since API requests are billed by input and output tokens, the JSON format isn't ideal: it repeats every field name in every record. To optimize token usage, we opted for the less flexible but more efficient CSV format, which states the field names once in a header row and so significantly reduces the number of tokens and lowers costs.
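The difference is easy to see with a small conversion sketch. The toCsv function and the record fields below are hypothetical, and the two sample rows are made-up values for illustration only, not data collected by the extension.

```javascript
// Hypothetical converter: turns an array of flat records into CSV text.
// Column names are taken from the first record and written once as a header.
function toCsv(records) {
  if (records.length === 0) return '';
  const columns = Object.keys(records[0]);

  // Quote a value only when it contains a comma, quote, or line break.
  const escape = (value) => {
    const text = value == null ? '' : String(value);
    return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };

  const rows = records.map((r) => columns.map((c) => escape(r[c])).join(','));
  return [columns.join(','), ...rows].join('\n');
}

// Made-up sample records (illustrative values only).
const sample = [
  { hash: 'a1b2c3d4', tag: 'A', mouseover: 3, mouseout: 3, click: 1 },
  { hash: 'e5f6a7b8', tag: 'IMG', mouseover: 2, mouseout: 2, click: 0 },
];

const csv = toCsv(sample);
// Character counts are only a rough proxy for token counts,
// but they show where the saving comes from.
console.log('JSON chars:', JSON.stringify(sample).length, 'CSV chars:', csv.length);
```

The saving grows with the number of records, because JSON repeats every key in each record while CSV pays for the header only once.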