|
8530
|
165
|
4
|
2026-04-14T06:50:22.089834+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149422089_m2.jpg...
|
Firefox
|
Jy 19798 evaluation for ai activity types by nikol Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet — Work...
|
1
|
github.com/jiminny/prophet/pull/468/changes#diff-d github.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8...
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Close tab
New Tab
Customize sidebar
Open Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
Skip to content
Skip to content
Open menu
Homepage (g then d)
jiminny
jiminny
prophet
prophet
Search or jump to…
Type
/
to search
Chat with Copilot
Open Copilot…
Create new...
Issues
Pull requests
Repositories
You have unread notifications(g then n)
Open user navigation menu
Repository navigation
Repository navigation
Code
Code
Issues
Issues
Pull requests (6)
Pull requests
(
6
)
Agents
Agents
Actions
Actions
Projects
Projects
Wiki
Wiki
Security and quality (20)
Security and quality
(
20
)
Insights
Insights
Settings
Settings
Important update
Important update
On April 24 we'll start using GitHub Copilot interaction data for AI model training unless you opt out.
Review this update
Review this update
and manage your preferences in your
GitHub account settings
GitHub account settings
.
Dismiss banner
Jy 19798 evaluation for ai activity types #468 Edit title
Jy 19798 evaluation for ai activity types
#
468
Edit title
Preview
Preview
Awaiting approval
Awaiting approval
Code
Code
Open
nikolaybiaivanov
nikolaybiaivanov
wants to merge 9 commits into
master
master
from
JY-19798-evaluation-for-ai-activity-types
JY-19798-evaluation-for-ai-activity-types
Copy head branch name to clipboard
Lines changed: 25671 additions & 0 deletions
Conversation (3)
Conversation
(
3
)
Commits (9)
Commits
(
9
)
Checks (3)
Checks
(
3
)
Files changed (203)
Files changed
(
203
)
Pull Request Toolbar
Pull Request Toolbar
Collapse file tree
Open
Jy 19798 evaluation for ai activity types
Jy 19798 evaluation for ai activity types
#
468
All commits
All commits
nikolaybiaivanov
nikolaybiaivanov
wants to merge 9 commits into
master
master
from
JY-19798-evaluation-for-ai-activity-types
JY-19798-evaluation-for-ai-activity-types
Copy head branch name to clipboard
2
/
203
viewed
Awaiting approval
Awaiting approval
Submit review
Submit
review
Open diff view settings
Open overview panel
Open comments panel
(
0
)
Filter files…
Filter options
File tree
File tree
eval_output/ai_activity_type/eval_datasets/v2-from-logs-replayable
ground_truth
input
src/eval
endpoints.py
endpoints.py
eval-ai-activity-type.py
eval-ai-activity-type.py...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"bounds":{"left":0.00234375,"top":0.045138888,"width":0.0890625,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"bounds":{"left":0.0,"top":0.08263889,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"bounds":{"left":0.015625,"top":0.09236111,"width":0.11796875,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"bounds":{"left":0.0,"top":0.11111111,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"bounds":{"left":0.015625,"top":0.12083333,"width":0.1984375,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.07890625,"top":0.11736111,"width":0.009375,"height":0.016666668},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.003125,"top":0.14097223,"width":0.08710937,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.003125,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open Google Gemini (⌃X)","depth":6,"bounds":{"left":0.01640625,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"bounds":{"left":0.029296875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.0421875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.05546875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"Skip to content","depth":6,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Skip to content","depth":7,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open menu","depth":10,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"Homepage (g then d)","depth":9,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"jiminny","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"jiminny","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"prophet","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"prophet","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Search or jump to…","depth":9,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Type","depth":12,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":12,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to search","depth":12,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Chat with Copilot","depth":10,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXMenuButton","text":"Open Copilot…","depth":9,"help_text":"","role_description":"menu button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXMenuButton","text":"Create new...","depth":9,"help_text":"","role_description":"menu button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXLink","text":"Issues","depth":9,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"Pull requests","depth":9,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"Repositories","depth":9,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"You have unread notifications(g then n)","depth":9,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open user navigation menu","depth":9,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Repository navigation","depth":9,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Repository navigation","depth":10,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Code","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Code","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Issues","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Issues","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Pull requests (6)","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pull requests","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"6","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Agents","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Agents","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Actions","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Actions","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Projects","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Projects","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Wiki","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Wiki","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Security and quality (20)","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Security and quality","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"20","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Insights","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Insights","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Settings","depth":12,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Settings","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Important update","depth":10,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Important update","depth":11,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"On April 24 we'll start using GitHub Copilot interaction data for AI model training unless you opt out.","depth":10,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Review this update","depth":10,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Review this update","depth":11,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and manage your preferences in your","depth":10,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"GitHub account settings","depth":10,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"GitHub account settings","depth":11,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":10,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Dismiss banner","depth":9,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Jy 19798 evaluation for ai activity types #468 Edit title","depth":13,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"#","depth":15,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"468","depth":15,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Edit title","depth":14,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXMenuButton","text":"Preview","depth":13,"help_text":"","role_description":"menu button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Preview","depth":15,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Awaiting approval","depth":13,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Awaiting approval","depth":15,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXMenuButton","text":"Code","depth":13,"help_text":"","role_description":"menu button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Code","depth":15,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open","depth":13,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"nikolaybiaivanov","depth":15,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"nikolaybiaivanov","depth":16,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"wants to merge 9 commits into","depth":15,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"master","depth":15,"role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"master","depth":16,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"from","depth":16,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"JY-19798-evaluation-for-ai-activity-types","depth":16,"role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"JY-19798-evaluation-for-ai-activity-types","depth":17,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy head branch name to clipboard","depth":16,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Lines changed: 25671 additions & 0 deletions","depth":14,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Conversation (3)","depth":16,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Conversation","depth":17,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Commits (9)","depth":16,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Commits","depth":17,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"9","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Checks (3)","depth":16,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Checks","depth":17,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Files changed (203)","depth":16,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Files changed","depth":17,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"203","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":18,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Pull Request Toolbar","depth":14,"bounds":{"left":0.10625,"top":0.065972224,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Pull Request Toolbar","depth":15,"bounds":{"left":0.10625,"top":0.068055555,"width":0.03515625,"height":0.07083333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Collapse file tree","depth":14,"bounds":{"left":0.10625,"top":0.056944445,"width":0.0109375,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"Open","depth":14,"bounds":{"left":0.1328125,"top":0.060416665,"width":0.0140625,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Jy 19798 evaluation for ai activity types","depth":14,"bounds":{"left":0.1546875,"top":0.050694443,"width":0.1046875,"height":0.014583333},"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types","depth":16,"bounds":{"left":0.1546875,"top":0.052083332,"width":0.1046875,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"#","depth":15,"bounds":{"left":0.2625,"top":0.052083332,"width":0.003515625,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"468","depth":15,"bounds":{"left":0.26601562,"top":0.052083332,"width":0.01015625,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"All commits","depth":14,"bounds":{"left":0.1515625,"top":0.0625,"width":0.03984375,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All commits","depth":16,"bounds":{"left":0.15507813,"top":0.06736111,"width":0.0265625,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"nikolaybiaivanov","depth":15,"bounds":{"left":0.19648437,"top":0.065972224,"width":0.037890624,"height":0.0125},"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"nikolaybiaivanov","depth":16,"bounds":{"left":0.19648437,"top":0.06736111,"width":0.037890624,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"wants to merge 9 commits into","depth":15,"bounds":{"left":0.2359375,"top":0.06736111,"width":0.068359375,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"master","depth":15,"bounds":{"left":0.3058594,"top":0.06458333,"width":0.021484375,"height":0.015277778},"role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"master","depth":16,"bounds":{"left":0.30820313,"top":0.06736111,"width":0.016796876,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"from","depth":16,"bounds":{"left":0.32890624,"top":0.06736111,"width":0.01015625,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"JY-19798-evaluation-for-ai-activity-types","depth":16,"bounds":{"left":0.340625,"top":0.06458333,"width":0.1203125,"height":0.015277778},"role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"JY-19798-evaluation-for-ai-activity-types","depth":17,"bounds":{"left":0.34296876,"top":0.06736111,"width":0.115625,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy head branch name to clipboard","depth":16,"bounds":{"left":0.4625,"top":0.0625,"width":0.0109375,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2","depth":15,"bounds":{"left":0.7796875,"top":0.06111111,"width":0.003125,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":15,"bounds":{"left":0.7828125,"top":0.06111111,"width":0.002734375,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"203","depth":15,"bounds":{"left":0.7867187,"top":0.06111111,"width":0.008984375,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"viewed","depth":15,"bounds":{"left":0.796875,"top":0.06111111,"width":0.015625,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Awaiting approval","depth":14,"bounds":{"left":0.8222656,"top":0.056944445,"width":0.0546875,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Awaiting approval","depth":16,"bounds":{"left":0.8335937,"top":0.06111111,"width":0.03984375,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Submit review","depth":14,"bounds":{"left":0.88007814,"top":0.056944445,"width":0.0453125,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Submit","depth":16,"bounds":{"left":0.88359374,"top":0.06111111,"width":0.0171875,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"review","depth":16,"bounds":{"left":0.9007813,"top":0.06111111,"width":0.01484375,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXMenuButton","text":"Open diff view settings","depth":14,"bounds":{"left":0.9285156,"top":0.056944445,"width":0.0109375,"height":0.019444445},"help_text":"","role_description":"menu button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open overview panel","depth":14,"bounds":{"left":0.94921875,"top":0.056944445,"width":0.0109375,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open comments panel","depth":14,"bounds":{"left":0.96171874,"top":0.056944445,"width":0.0203125,"height":0.019444445},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"(","depth":16,"bounds":{"left":0.97148436,"top":0.06111111,"width":0.003125,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":16,"bounds":{"left":0.9746094,"top":0.06111111,"width":0.003125,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":")","depth":16,"bounds":{"left":0.9777344,"top":0.06111111,"width":0.001953125,"height":0.010416667},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"Filter files…","depth":16,"bounds":{"left":0.11953125,"top":0.09861111,"width":0.080078125,"height":0.020833334},"help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXMenuButton","text":"Filter options","depth":16,"bounds":{"left":0.203125,"top":0.09791667,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"menu button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"File tree","depth":15,"bounds":{"left":0.10664062,"top":0.13125,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File tree","depth":16,"bounds":{"left":0.10664062,"top":0.13333334,"width":0.016796876,"height":0.045833334},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/eval_datasets/v2-from-logs-replayable","depth":19,"bounds":{"left":0.12539062,"top":0.13680555,"width":0.17070313,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ground_truth","depth":21,"bounds":{"left":0.12851563,"top":0.15902779,"width":0.033203125,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"input","depth":21,"bounds":{"left":0.12851563,"top":0.18125,"width":0.012890625,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"src/eval","depth":19,"bounds":{"left":0.12539062,"top":0.20347223,"width":0.01953125,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"endpoints.py","depth":21,"bounds":{"left":0.12851563,"top":0.22569445,"width":0.0328125,"height":0.011805556},"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"endpoints.py","depth":22,"bounds":{"left":0.12851563,"top":0.22569445,"width":0.0328125,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"eval-ai-activity-type.py","depth":21,"bounds":{"left":0.12851563,"top":0.24861111,"width":0.058984376,"height":0.011805556},"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"eval-ai-activity-type.py","depth":22,"bounds":{"left":0.12851563,"top":0.24861111,"width":0.058984376,"height":0.011805556},"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
5864725084621537271
|
1401881522361609451
|
click
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Close tab
New Tab
Customize sidebar
Open Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
Skip to content
Skip to content
Open menu
Homepage (g then d)
jiminny
jiminny
prophet
prophet
Search or jump to…
Type
/
to search
Chat with Copilot
Open Copilot…
Create new...
Issues
Pull requests
Repositories
You have unread notifications(g then n)
Open user navigation menu
Repository navigation
Repository navigation
Code
Code
Issues
Issues
Pull requests (6)
Pull requests
(
6
)
Agents
Agents
Actions
Actions
Projects
Projects
Wiki
Wiki
Security and quality (20)
Security and quality
(
20
)
Insights
Insights
Settings
Settings
Important update
Important update
On April 24 we'll start using GitHub Copilot interaction data for AI model training unless you opt out.
Review this update
Review this update
and manage your preferences in your
GitHub account settings
GitHub account settings
.
Dismiss banner
Jy 19798 evaluation for ai activity types #468 Edit title
Jy 19798 evaluation for ai activity types
#
468
Edit title
Preview
Preview
Awaiting approval
Awaiting approval
Code
Code
Open
nikolaybiaivanov
nikolaybiaivanov
wants to merge 9 commits into
master
master
from
JY-19798-evaluation-for-ai-activity-types
JY-19798-evaluation-for-ai-activity-types
Copy head branch name to clipboard
Lines changed: 25671 additions & 0 deletions
Conversation (3)
Conversation
(
3
)
Commits (9)
Commits
(
9
)
Checks (3)
Checks
(
3
)
Files changed (203)
Files changed
(
203
)
Pull Request Toolbar
Pull Request Toolbar
Collapse file tree
Open
Jy 19798 evaluation for ai activity types
Jy 19798 evaluation for ai activity types
#
468
All commits
All commits
nikolaybiaivanov
nikolaybiaivanov
wants to merge 9 commits into
master
master
from
JY-19798-evaluation-for-ai-activity-types
JY-19798-evaluation-for-ai-activity-types
Copy head branch name to clipboard
2
/
203
viewed
Awaiting approval
Awaiting approval
Submit review
Submit
review
Open diff view settings
Open overview panel
Open comments panel
(
0
)
Filter files…
Filter options
File tree
File tree
eval_output/ai_activity_type/eval_datasets/v2-from-logs-replayable
ground_truth
input
src/eval
endpoints.py
endpoints.py
eval-ai-activity-type.py
eval-ai-activity-type.py...
|
NULL
|
|
8533
|
165
|
6
|
2026-04-14T06:50:29.073263+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149429073_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Firefox File Edit View History Bookmarks Profiles Firefox File Edit View History Bookmarks Profiles Tools Window Help••+→ cPlatform Sprint 1 Q2 - Platform Tea( Jy 19798 evaluation for ai activ X+ New labi? Open© github.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8Jy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types> src/eval/endpoints.py 0 +Q Filter files...• E eval_output/ai_activity_type/eval….> D ground_truth>B input• E src/evalendpoints.pyD eval-ai-activity-type.py*.gitignorev src/eval/eval-ai-activity-type.py0c -0,0 +1,330 gg24•< 4o ll • | Daily - Platform • nowA100% C• 8 • Tue 14 Apr 9:50:28^ 2 1 203 viewed• Awallind aporovaliSuomitrevewy+5 -n Viewed+330 UUU Viewed+ #!/usr/bin/env python3|+ Import async1o+ import json+Import os+ Trom pathlib import Path+ Trom Lyoino 1mporl Any, UIcl L150+ Import dlonuuo+ from src.component.call_processing.schemes.ai activity_ type_request import AiActivityTypeRequest+ Trom src.component.call_processing.schemes.al_activity-type_response import Alactivitylyperesponse+ from src.eval.abstract_evaluator import AbstractEvaluator+ Trom src.eval.couleccors imporr collecc tesc ser as es collect test seu+ from src.eval.common_evaluator_runner import create_parser, run_evaluation+ from src.eval.endpoints import run_endpoint as run_endpoint_http+ trom src.eval.models import EvaluationResult+ class AiActivityTypeEvaluator(AbstractEvaluator):ENDPOINT_PATH = "call/ai-activity-type"REQUEST_MODEL = AiActivityTypeRequestRESPONSE_MODEL = AiActivityTypeResponseOUTPUT_SUBFOLDER = "ai_activity_type"USES_LLM_EVALUATION = FalseREADS_FROM_DATASET_FOLDER = TrueDATA_SET: str = "initial"MINIMUM_REQUEST_BODY_SIZE: int = 0SKIP_UNREPLAYABLE_ON_COLLECT: bool = TrueREPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.0POPULATE_GROUND_TRUTH_LLM: bool = FalseGROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"COMPARE_TO_GROUND_TRUTH: str = "es"def _init_(self, **kwargs: Any) -> None:super()._init_(**kwargs)_repo_root = Path(_file_).parents[2]_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"_dataset_dir = _datasets_root / self._class_.DATA_SETself._dataset_dir =_dataset_dirself._inputs_dir = _dataset_dir / "input"self._ground_truths_dir = _dataset_dir / "ground_truth"async der coltect_test_seu(seut) → Listibicuistr, AnyJu:if self.es host name:...
|
NULL
|
144205119170133136
|
NULL
|
click
|
ocr
|
NULL
|
Firefox File Edit View History Bookmarks Profiles Firefox File Edit View History Bookmarks Profiles Tools Window Help••+→ cPlatform Sprint 1 Q2 - Platform Tea( Jy 19798 evaluation for ai activ X+ New labi? Open© github.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8Jy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types> src/eval/endpoints.py 0 +Q Filter files...• E eval_output/ai_activity_type/eval….> D ground_truth>B input• E src/evalendpoints.pyD eval-ai-activity-type.py*.gitignorev src/eval/eval-ai-activity-type.py0c -0,0 +1,330 gg24•< 4o ll • | Daily - Platform • nowA100% C• 8 • Tue 14 Apr 9:50:28^ 2 1 203 viewed• Awallind aporovaliSuomitrevewy+5 -n Viewed+330 UUU Viewed+ #!/usr/bin/env python3|+ Import async1o+ import json+Import os+ Trom pathlib import Path+ Trom Lyoino 1mporl Any, UIcl L150+ Import dlonuuo+ from src.component.call_processing.schemes.ai activity_ type_request import AiActivityTypeRequest+ Trom src.component.call_processing.schemes.al_activity-type_response import Alactivitylyperesponse+ from src.eval.abstract_evaluator import AbstractEvaluator+ Trom src.eval.couleccors imporr collecc tesc ser as es collect test seu+ from src.eval.common_evaluator_runner import create_parser, run_evaluation+ from src.eval.endpoints import run_endpoint as run_endpoint_http+ trom src.eval.models import EvaluationResult+ class AiActivityTypeEvaluator(AbstractEvaluator):ENDPOINT_PATH = "call/ai-activity-type"REQUEST_MODEL = AiActivityTypeRequestRESPONSE_MODEL = AiActivityTypeResponseOUTPUT_SUBFOLDER = "ai_activity_type"USES_LLM_EVALUATION = FalseREADS_FROM_DATASET_FOLDER = TrueDATA_SET: str = "initial"MINIMUM_REQUEST_BODY_SIZE: int = 0SKIP_UNREPLAYABLE_ON_COLLECT: bool = TrueREPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.0POPULATE_GROUND_TRUTH_LLM: bool = FalseGROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"COMPARE_TO_GROUND_TRUTH: str = "es"def _init_(self, **kwargs: Any) -> None:super()._init_(**kwargs)_repo_root = Path(_file_).parents[2]_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"_dataset_dir = _datasets_root / self._class_.DATA_SETself._dataset_dir =_dataset_dirself._inputs_dir = _dataset_dir / "input"self._ground_truths_dir = _dataset_dir / "ground_truth"async der coltect_test_seu(seut) → Listibicuistr, AnyJu:if self.es host name:...
|
NULL
|
|
8535
|
164
|
4
|
2026-04-14T06:50:32.324802+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149432324_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelp$0.| Daily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2) - niE:1Q 8•Tue 14 Apr 9:50C 3y 19798 eval(7 Service-Deskjr Foet Ameson• Cost | Claude x©Actions secrC C DependatActions sec• Dependabot: xHa Dependatinny/prophet/actions/runs/24195639151/ob/70624720594?pr=468B AWSE3 SSHс *El Datados * Clude5 CircieclD Sentry8 TaskslInsights & Coachin.• Dev• Ux.0 Al BookmarksRun Flake8 & Mypy•, Search logs ISummary|Run Flake8 & MypyRun details© UsageWorkfiow fileStefka StoyanovalHDSet up Python> O Install Poetry|> • Load cached Poetry dependencies |> © Install dependencies using Poetry|v @ Run Flake81 •Run poetry run flakeß, -count -select-E9,F63,F7,F82,N293,N291,W292 --show-source -statistics./src/eval/eval-al-activity-type.py:330:63: N292 no newline at end of fileasyncio.runtrun_cvaluatzon(AlactavityTypetvaluator, args)Error: Process conpleted with exit code 1.• Run Mypy|• Post Load cached Poetry dependencies |• Post Set up Python |› © Post Checkout code> O Complete job• 2026 0i4u6, Inc.HortroPrivacy Security Status Community Docsvonest menesn25s7sSteliyan GeorgievKovalik9:50 AM | Daily - Platform5:088 • Tue 14 Apr 9:50:32Chat6Nikolay YankovNikolay Ivanov...
|
NULL
|
-7296810352464578903
|
NULL
|
click
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelp$0.| Daily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2) - niE:1Q 8•Tue 14 Apr 9:50C 3y 19798 eval(7 Service-Deskjr Foet Ameson• Cost | Claude x©Actions secrC C DependatActions sec• Dependabot: xHa Dependatinny/prophet/actions/runs/24195639151/ob/70624720594?pr=468B AWSE3 SSHс *El Datados * Clude5 CircieclD Sentry8 TaskslInsights & Coachin.• Dev• Ux.0 Al BookmarksRun Flake8 & Mypy•, Search logs ISummary|Run Flake8 & MypyRun details© UsageWorkfiow fileStefka StoyanovalHDSet up Python> O Install Poetry|> • Load cached Poetry dependencies |> © Install dependencies using Poetry|v @ Run Flake81 •Run poetry run flakeß, -count -select-E9,F63,F7,F82,N293,N291,W292 --show-source -statistics./src/eval/eval-al-activity-type.py:330:63: N292 no newline at end of fileasyncio.runtrun_cvaluatzon(AlactavityTypetvaluator, args)Error: Process conpleted with exit code 1.• Run Mypy|• Post Load cached Poetry dependencies |• Post Set up Python |› © Post Checkout code> O Complete job• 2026 0i4u6, Inc.HortroPrivacy Security Status Community Docsvonest menesn25s7sSteliyan GeorgievKovalik9:50 AM | Daily - Platform5:088 • Tue 14 Apr 9:50:32Chat6Nikolay YankovNikolay Ivanov...
|
NULL
|
|
8536
|
165
|
8
|
2026-04-14T06:50:32.350923+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149432350_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...l1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types LPlatform Sprint 1 Q2 - Platform TeaGeminiPROsrc/eval/endpoints.pyEQ Filter files...( Jy 19798 evaluation for ai activ+ New lab• E eval_output/ai_activity_type/eval.….> • ground_truth>I inoutv src/eval/eval-ai-activity-type.py..•0a -0,0 +1,330 ggE src/evalE endpoints.pyE eval-ai-activity-type.py#.gitignore1 Daily - Platform • nowA• 2 1 203 viewed|• Awallind aporovall100%8 • Tue 14 Apr 9:50:3145 П0OOOT• Viewed+330 UuuuuVlewedHi LukasWhere should westart?Create image* Create musicBoost my dayHelp me learnWrite anythingO Enter a prompt for Gemini+New]summarise pagePro v+ #!/usr/bin/env python3+ 1mport asynclo+ import json+ import os+ from pathlib import Path+ from typing import Any, Dict, List+ import aiohttp+ from src.component.call_processing.schemes.ai activity type_request importALACELVLEV voekequest11+ from src.component.call_processing.schemes.ai_activity_type_response importA1ACCIVICYIypekesponse+ from src.eval.abstract_evaluator import AbstractEvaluator+ from src.eval.collectors import collect test set as es collect test set+ from src.eval.common_evaluator_runner import create_parser, run_evaluation+ from src.eval.endpoints import run endpoint as run endpoint http+ trom src,eval,models 1moort Evaluatzonkesult23+ class AiActivityTypeEvaluator(AbstractEvaluator):ENDPOINT_PATH = "call/ai-activity-type"REQUEST_MODEL = AiActivityTypeRequestRESrUNst MoDEL = ALACCIviLyIvpekesponseOUTPUT_SUBFOLDER = "ai_activity_type"USES_LLM_EVALUATION = FalseREADS_FROM_DATASET_FOLDER = TrueDATA_SET: str = "initial"MINIMUM_REQUEST_BODY_SIZE: int = 0SKIP_UNREPLAYABLE_ON_COLLECT: bool = TrueREPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.0rurULAlL onuuNu Ikuin LlM. Dool = IalseGROUND TRUTH LIM ENDPOINT: str = "OPENAL:apt=5 2"LUMPAKE TO GKUUND TKUin: Str=esdeт 1nic selt, **kwaros. Any -> None:super()._init_(**kwargs)_repo_root = Path(_file_).parents (2]_datasets_root =_repo_root / "eval_output" / self.OUTPUT_SUBFOLDER /"eval datasets"_dataset_dir = _datasets_root / self._class_.DATA_SETself._dataset_dir =_dataset_dirselt._inputs_air = _dataset_dar / "input"self. around truths dir = dataset dir f "around +ruth"...
|
NULL
|
-7084697545025634029
|
NULL
|
click
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...l1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types LPlatform Sprint 1 Q2 - Platform TeaGeminiPROsrc/eval/endpoints.pyEQ Filter files...( Jy 19798 evaluation for ai activ+ New lab• E eval_output/ai_activity_type/eval.….> • ground_truth>I inoutv src/eval/eval-ai-activity-type.py..•0a -0,0 +1,330 ggE src/evalE endpoints.pyE eval-ai-activity-type.py#.gitignore1 Daily - Platform • nowA• 2 1 203 viewed|• Awallind aporovall100%8 • Tue 14 Apr 9:50:3145 П0OOOT• Viewed+330 UuuuuVlewedHi LukasWhere should westart?Create image* Create musicBoost my dayHelp me learnWrite anythingO Enter a prompt for Gemini+New]summarise pagePro v+ #!/usr/bin/env python3+ 1mport asynclo+ import json+ import os+ from pathlib import Path+ from typing import Any, Dict, List+ import aiohttp+ from src.component.call_processing.schemes.ai activity type_request importALACELVLEV voekequest11+ from src.component.call_processing.schemes.ai_activity_type_response importA1ACCIVICYIypekesponse+ from src.eval.abstract_evaluator import AbstractEvaluator+ from src.eval.collectors import collect test set as es collect test set+ from src.eval.common_evaluator_runner import create_parser, run_evaluation+ from src.eval.endpoints import run endpoint as run endpoint http+ trom src,eval,models 1moort Evaluatzonkesult23+ class AiActivityTypeEvaluator(AbstractEvaluator):ENDPOINT_PATH = "call/ai-activity-type"REQUEST_MODEL = AiActivityTypeRequestRESrUNst MoDEL = ALACCIviLyIvpekesponseOUTPUT_SUBFOLDER = "ai_activity_type"USES_LLM_EVALUATION = FalseREADS_FROM_DATASET_FOLDER = TrueDATA_SET: str = "initial"MINIMUM_REQUEST_BODY_SIZE: int = 0SKIP_UNREPLAYABLE_ON_COLLECT: bool = TrueREPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.0rurULAlL onuuNu Ikuin LlM. Dool = IalseGROUND TRUTH LIM ENDPOINT: str = "OPENAL:apt=5 2"LUMPAKE TO GKUUND TKUin: Str=esdeт 1nic selt, **kwaros. Any -> None:super()._init_(**kwargs)_repo_root = Path(_file_).parents (2]_datasets_root =_repo_root / "eval_output" / self.OUTPUT_SUBFOLDER /"eval datasets"_dataset_dir = _datasets_root / self._class_.DATA_SETself._dataset_dir =_dataset_dirselt._inputs_air = _dataset_dar / "input"self. around truths dir = dataset dir f "around +ruth"...
|
NULL
|
|
8539
|
165
|
10
|
2026-04-14T06:50:36.563336+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149436563_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
firefoxFileEoitViewHistoryBookmarksProfilesToolsWi firefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/liminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab/dd0782d4a4d93075ca5ce05903d77e8Google Gemini...Platform Sprint 1 Q2 - Platform Tea( Jy 19798 evaluation for ai activ+ New labim on page"<tab Itle>Jv19798evalarion tor allactivity types bynikol</tabTitle>"...Jy 19798 evaluation for ai activity tvpes #468i OperAll commits-nikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types L• Filter files...src/eval/endpoints.pyEv = eval_ output/ai_activity_type/eval...v src/eval/eval-ai-activity-type.py L> • ground_truth..•aa-0.0 +1.330 ad>I inoutF src/eval= enapolnts.py+ eval-ai-activity-type.pyE .gitignoreDaily - Platform • now• 2 1 203 viewed|Awallind aporovali100%• Tue 14 Apr 9:50:3645 nannOI• Viewed+330 UUUUuVlewedInis Pytnon seript derines theAlActivltylypetvaluator class, aspecialized tool tor evaluating Al-arivenactivity type classification. It extends a baseevaluation namework to nanole datacollection, ground-truth comparison, andperformance scoringCore Functionality0 Enter a oromot for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour orivacy & GeminSummarise page+ #!/usr/bin/env python3+ 1mport asynclo+ import json+ import os+ trom pathlib import Path+ from typing import Any, Dict, List+ import aiohttp+ from src.component.call_processing.schemes.ai activity type_request importALACELVLEV voekequest11+ from src.component.call_processing.schemes.ai activity type_response importA1ACCIVICYIypekesponse+ trom src.eval.abstract_evaluator import Abstractevaluator+ from src.eval.collectors import collect test set as es collect test set+ trom src.eval.common_evaluator_runner import create_parser, run_evaluation+ from src.eval.endpoints import run endpoint as run endpoint http+ trom src,eval,models 1moort Evaluatzonkesult+ class AiActivityTypeEvaluator(AbstractEvaluator):ENDPOINT PATH = "call/ai-activity-type"REQUEST_MODEL = AiActivityTypeRequestRESrUNst MoDEL = ALACCIviLyIvpekesponse23ouTPUT_sUBFOLDER = "a1_activity_type"USES LLM EVALUATION = FalseREADS_FROM_DATASET_ FOLDER = TrueDATA SET: str = "initial"MINIMUM_REQUEST BODY_ SIZE: int = 0SKLP_UNREPLAYABLE_ON_COLLECT: bool = TrueREPLAYABILITY CHECK TIMEOUT SECONDS: float = 20.0rurULAlL onuuNu Ikuin LlM. Dool = IalseGROUND TRUTH LIM ENDPOINT: str = "OPENAL:apt=5 2"LUMPAKE TU GKUUND IKuin: Str=esdeт 1nic selt, **kwaros. Any -> None:super().__init__(**kwargs)_repo_root = Path(_file_).parents (2]_datasets_root = _repo_root / "eval_output" / selt.OUTPUT_SUBFOLDER /"eval datasets"_dataset_dir = _datasets_root / selt.__class__.DATA_SE!self._dataset_dir =_dataset_dirselt._inputs_air =_dataset_air / "Input"self. around truths dir= datacet dir "around truth'l...
|
NULL
|
-7815770597961825444
|
NULL
|
visual_change
|
ocr
|
NULL
|
firefoxFileEoitViewHistoryBookmarksProfilesToolsWi firefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/liminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab/dd0782d4a4d93075ca5ce05903d77e8Google Gemini...Platform Sprint 1 Q2 - Platform Tea( Jy 19798 evaluation for ai activ+ New labim on page"<tab Itle>Jv19798evalarion tor allactivity types bynikol</tabTitle>"...Jy 19798 evaluation for ai activity tvpes #468i OperAll commits-nikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types L• Filter files...src/eval/endpoints.pyEv = eval_ output/ai_activity_type/eval...v src/eval/eval-ai-activity-type.py L> • ground_truth..•aa-0.0 +1.330 ad>I inoutF src/eval= enapolnts.py+ eval-ai-activity-type.pyE .gitignoreDaily - Platform • now• 2 1 203 viewed|Awallind aporovali100%• Tue 14 Apr 9:50:3645 nannOI• Viewed+330 UUUUuVlewedInis Pytnon seript derines theAlActivltylypetvaluator class, aspecialized tool tor evaluating Al-arivenactivity type classification. It extends a baseevaluation namework to nanole datacollection, ground-truth comparison, andperformance scoringCore Functionality0 Enter a oromot for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour orivacy & GeminSummarise page+ #!/usr/bin/env python3+ 1mport asynclo+ import json+ import os+ trom pathlib import Path+ from typing import Any, Dict, List+ import aiohttp+ from src.component.call_processing.schemes.ai activity type_request importALACELVLEV voekequest11+ from src.component.call_processing.schemes.ai activity type_response importA1ACCIVICYIypekesponse+ trom src.eval.abstract_evaluator import Abstractevaluator+ from src.eval.collectors import collect test set as es collect test set+ trom src.eval.common_evaluator_runner import create_parser, run_evaluation+ from src.eval.endpoints import run endpoint as run endpoint http+ trom src,eval,models 1moort Evaluatzonkesult+ class AiActivityTypeEvaluator(AbstractEvaluator):ENDPOINT PATH = "call/ai-activity-type"REQUEST_MODEL = AiActivityTypeRequestRESrUNst MoDEL = ALACCIviLyIvpekesponse23ouTPUT_sUBFOLDER = "a1_activity_type"USES LLM EVALUATION = FalseREADS_FROM_DATASET_ FOLDER = TrueDATA SET: str = "initial"MINIMUM_REQUEST BODY_ SIZE: int = 0SKLP_UNREPLAYABLE_ON_COLLECT: bool = TrueREPLAYABILITY CHECK TIMEOUT SECONDS: float = 20.0rurULAlL onuuNu Ikuin LlM. Dool = IalseGROUND TRUTH LIM ENDPOINT: str = "OPENAL:apt=5 2"LUMPAKE TU GKUUND IKuin: Str=esdeт 1nic selt, **kwaros. Any -> None:super().__init__(**kwargs)_repo_root = Path(_file_).parents (2]_datasets_root = _repo_root / "eval_output" / selt.OUTPUT_SUBFOLDER /"eval datasets"_dataset_dir = _datasets_root / selt.__class__.DATA_SE!self._dataset_dir =_dataset_dirselt._inputs_air =_dataset_air / "Input"self. around truths dir= datacet dir "around truth'l...
|
NULL
|
|
8540
|
164
|
6
|
2026-04-14T06:50:39.019186+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149439019_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelp$0.| Daily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2) - niE:1Q 8•Tue 14 Apr 9:50C 3y 19798 eval(7 Service-Deskjr Foet Ameson• Cost | Claude x©Actions secrC C DependatActions sec• Dependabot: xHa Dependatinny/prophet/actions/runs/24195639151/ob/70624720594?pr=468B AWSE3 SSHс *El Datados * Clude5 CircieclD Sentry8 TaskslInsights & Coachin.• Dev• Ux.0 Al BookmarksRun Flake8 & Mypy•, Search logs ISummary|Run Flake8 & MypyRun details© UsageWorkfiow fileStefka StoyanovaHDSet up Python> O Install Poetry|> • Load cached Poetry dependencies |> © Install dependencies using Poetry|v @ Run Flake81 •Run poetry run flakeß, -count -select-E9,F63,F7,F82,N293,N291,W292 --show-source -statistics./src/eval/eval-al-activity-type.py:330:63: N292 no newline at end of fileasyncio.runtrun_cvaluatzon(AlactavityTypetvaluator, args)Error: Process conpleted with exit code 1.• Run Mypy|• Post Load cached Poetry dependencies |• Post Set up Python|› © Post Checkout code> O Complete job• 2026 0i4u6, Inc.HortroPrivacy Security Status Community Docsvonest menesn25san GeorgievKovalik9:50 AM | Daily - Platform5:148 • Tue 14 Apr 9:50:38Chat6Nikolay YankovNikolay Ivanov...
|
NULL
|
-8188601352334981323
|
NULL
|
visual_change
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelp$0.| Daily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2) - niE:1Q 8•Tue 14 Apr 9:50C 3y 19798 eval(7 Service-Deskjr Foet Ameson• Cost | Claude x©Actions secrC C DependatActions sec• Dependabot: xHa Dependatinny/prophet/actions/runs/24195639151/ob/70624720594?pr=468B AWSE3 SSHс *El Datados * Clude5 CircieclD Sentry8 TaskslInsights & Coachin.• Dev• Ux.0 Al BookmarksRun Flake8 & Mypy•, Search logs ISummary|Run Flake8 & MypyRun details© UsageWorkfiow fileStefka StoyanovaHDSet up Python> O Install Poetry|> • Load cached Poetry dependencies |> © Install dependencies using Poetry|v @ Run Flake81 •Run poetry run flakeß, -count -select-E9,F63,F7,F82,N293,N291,W292 --show-source -statistics./src/eval/eval-al-activity-type.py:330:63: N292 no newline at end of fileasyncio.runtrun_cvaluatzon(AlactavityTypetvaluator, args)Error: Process conpleted with exit code 1.• Run Mypy|• Post Load cached Poetry dependencies |• Post Set up Python|› © Post Checkout code> O Complete job• 2026 0i4u6, Inc.HortroPrivacy Security Status Community Docsvonest menesn25san GeorgievKovalik9:50 AM | Daily - Platform5:148 • Tue 14 Apr 9:50:38Chat6Nikolay YankovNikolay Ivanov...
|
NULL
|
|
8543
|
164
|
8
|
2026-04-14T06:51:18.848279+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149478848_m1.jpg...
|
Firefox
|
Jy 19798 evaluation for ai activity types by nikol Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet — Work...
|
1
|
github.com/jiminny/prophet/pull/468/changes#diff-d github.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8...
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Show more options
Show thinking
Gemini said
Gemini said
Yes,
there are several issues and code smells in this Pull Request,
ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues,
categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
,
the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one.
This means the script waits for one HTTP request to finish before starting the next,
completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"WORK, Google Account: lukas.kovalik@jiminny.com","depth":12,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Chat","depth":12,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":22,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This Python script defines the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Core Functionality","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Core Functionality","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Data Collection (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"collect_test_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"* Supports fetching raw samples from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Elasticsearch (ES)","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or loading them from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"local JSON files","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Implements a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"replayability check","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":": It probes the live API endpoint to ensure the activity exists before including it in the dataset.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Filters data based on request body size, language, and the presence of an original response.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Ground Truth Management:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Saves inputs and \"gold standard\" responses to a local directory structure (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/...","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":").","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Features an optional mode to populate ground truth using a high-order LLM (e.g.,","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"gpt-5_2","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") if the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"POPULATE_GROUND_TRUTH_LLM","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"flag is set.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Evaluation Logic (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"evaluate_against_ground_truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Compares the model's predicted","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ai_activity_type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"against either the original ES response or the LLM-generated ground truth.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Assigns a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"quality_score","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"of","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a match and","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a mismatch or request failure.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Key Configuration Constants","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Key Configuration Constants","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy table","depth":24,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Execution Workflow","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Execution Workflow","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Parsing:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script uses","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"create_parser","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to handle CLI arguments like","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--data-set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--populate-ground-truth-llm","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", and","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--compare-to-ground-truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Initialization:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sets up directory paths for inputs and ground truths based on the selected dataset name.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Run:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Executes the evaluation via","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.run(run_evaluation(...))","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", which orchestrates the collection, endpoint calling, and result scoring phases.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said are there soem issues with this PR","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"are there soem issues with this PR","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes,","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"there are several issues and code smells in this Pull Request,","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ranging from performance bottlenecks to architectural anti-patterns.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of the most notable issues,","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"categorized by impact:","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Performance & Async Anti-Patterns","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Performance & Async Anti-Patterns","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sequential Requests Defeating Async (Lines 86-106):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inside","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"_collect_from_es","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the code loops through","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"raw_samples","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"await","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"s a","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"session.post","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"request for each sample one by one.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This means the script waits for one HTTP request to finish before starting the next,","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"completely defeating the purpose of using","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for asynchronous concurrency.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Collect these coroutines into a list and use","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4108366247784106262
|
8731758625087598999
|
idle
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Show more options
Show thinking
Gemini said
Gemini said
Yes,
there are several issues and code smells in this Pull Request,
ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues,
categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
,
the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one.
This means the script waits for one HTTP request to finish before starting the next,
completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use...
|
NULL
|
|
8544
|
165
|
12
|
2026-04-14T06:51:22.047656+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149482047_m2.jpg...
|
Firefox
|
Jy 19798 evaluation for ai activity types by nikol Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet — Work...
|
1
|
github.com/jiminny/prophet/pull/468/changes#diff-d github.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8...
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Show more options
Show thinking
Gemini said
Gemini said
Yes,
there are several issues and code smells in this Pull Request,
ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues,
categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
,
the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one.
This means the script waits for one HTTP request to finish before starting the next,
completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions.
In Python,
this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
,
or run the blocking I/O operations in a separate thread using
asyncio.to_thread()...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"bounds":{"left":0.00234375,"top":0.045138888,"width":0.0890625,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"bounds":{"left":0.0,"top":0.08263889,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"bounds":{"left":0.015625,"top":0.09236111,"width":0.11796875,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"bounds":{"left":0.0,"top":0.11111111,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"bounds":{"left":0.015625,"top":0.12083333,"width":0.1984375,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.07890625,"top":0.11736111,"width":0.009375,"height":0.016666668},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.003125,"top":0.14097223,"width":0.08710937,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.003125,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.01640625,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"bounds":{"left":0.029296875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.0421875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.05546875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.2171875,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.23125,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"WORK, Google Account: lukas.kovalik@jiminny.com","depth":12,"bounds":{"left":0.228125,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.0984375,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Chat","depth":12,"bounds":{"left":0.1953125,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.2109375,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.09335937,"top":0.12847222,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.09335937,"top":0.13055556,"width":0.14101562,"height":0.022222223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":22,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This Python script defines the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Core Functionality","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Core Functionality","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Data Collection (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"collect_test_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"* Supports fetching raw samples from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Elasticsearch (ES)","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or loading them from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"local JSON files","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Implements a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"replayability check","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":": It probes the live API endpoint to ensure the activity exists before including it in the dataset.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Filters data based on request body size, language, and the presence of an original response.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Ground Truth Management:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Saves inputs and \"gold standard\" responses to a local directory structure (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/...","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":").","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Features an optional mode to populate ground truth using a high-order LLM (e.g.,","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"gpt-5_2","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") if the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"POPULATE_GROUND_TRUTH_LLM","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"flag is set.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Evaluation Logic (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"evaluate_against_ground_truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Compares the model's predicted","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ai_activity_type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"against either the original ES response or the LLM-generated ground truth.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Assigns a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"quality_score","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"of","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a match and","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a mismatch or request failure.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Key Configuration Constants","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Key Configuration Constants","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy table","depth":24,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Execution Workflow","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Execution Workflow","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Parsing:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script uses","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"create_parser","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to handle CLI arguments like","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--data-set","depth":27,"bounds":{"left":0.1171875,"top":0.0,"width":0.107421875,"height":0.03125},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.12929687,"top":0.0,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--populate-ground-truth-llm","depth":27,"bounds":{"left":0.13476562,"top":0.0,"width":0.08867188,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", and","depth":26,"bounds":{"left":0.1171875,"top":0.0,"width":0.11171875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--compare-to-ground-truth","depth":27,"bounds":{"left":0.13203125,"top":0.0,"width":0.08203125,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"bounds":{"left":0.21640626,"top":0.0,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Initialization:","depth":26,"bounds":{"left":0.1171875,"top":0.0,"width":0.039453126,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sets up directory paths for inputs and ground truths based on the selected dataset name.","depth":26,"bounds":{"left":0.1171875,"top":0.0,"width":0.11875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Run:","depth":26,"bounds":{"left":0.1171875,"top":0.024305556,"width":0.013671875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Executes the evaluation via","depth":26,"bounds":{"left":0.13085938,"top":0.024305556,"width":0.080078125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.run(run_evaluation(...))","depth":27,"bounds":{"left":0.11953125,"top":0.04375,"width":0.10507812,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", which orchestrates the collection, endpoint calling, and result scoring phases.","depth":26,"bounds":{"left":0.1171875,"top":0.04236111,"width":0.11289062,"height":0.06875},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.1203125,"top":0.15972222,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.1375,"top":0.15972222,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said are there soem issues with this PR","depth":21,"bounds":{"left":0.1609375,"top":0.16805555,"width":0.072265625,"height":0.03888889},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.09335937,"top":0.1701389,"width":0.0234375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"are there soem issues with this PR","depth":23,"bounds":{"left":0.1609375,"top":0.17083333,"width":0.062109374,"height":0.034027778},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Show more options","depth":20,"bounds":{"left":0.22382812,"top":0.22986111,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":26,"bounds":{"left":0.11953125,"top":0.23680556,"width":0.036328126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":20,"bounds":{"left":0.11679687,"top":0.27291667,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"bounds":{"left":0.11679687,"top":0.27430555,"width":0.048046876,"height":0.016666668},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes,","depth":24,"bounds":{"left":0.10390625,"top":0.28125,"width":0.011328125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"there are several issues and code smells in this Pull Request,","depth":24,"bounds":{"left":0.10390625,"top":0.28125,"width":0.12890625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ranging from performance bottlenecks to architectural anti-patterns.","depth":24,"bounds":{"left":0.10390625,"top":0.29930556,"width":0.13242188,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of the most notable issues,","depth":24,"bounds":{"left":0.10390625,"top":0.34652779,"width":0.11640625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"categorized by impact:","depth":24,"bounds":{"left":0.123046875,"top":0.36458334,"width":0.06679688,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Performance & Async Anti-Patterns","depth":23,"bounds":{"left":0.10390625,"top":0.4,"width":0.13242188,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Performance & Async Anti-Patterns","depth":24,"bounds":{"left":0.10390625,"top":0.40138888,"width":0.10859375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sequential Requests Defeating Async (Lines 86-106):","depth":26,"bounds":{"left":0.11757813,"top":0.42430556,"width":0.11601563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inside","depth":26,"bounds":{"left":0.16289063,"top":0.44236112,"width":0.019921875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"_collect_from_es","depth":27,"bounds":{"left":0.11992188,"top":0.46180555,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.17460938,"top":0.46041667,"width":0.001953125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the code loops through","depth":26,"bounds":{"left":0.11757813,"top":0.46041667,"width":0.10429688,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"raw_samples","depth":27,"bounds":{"left":0.14414063,"top":0.4798611,"width":0.0359375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"bounds":{"left":0.18242188,"top":0.47847223,"width":0.0140625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"await","depth":27,"bounds":{"left":0.19882813,"top":0.4798611,"width":0.01640625,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"s a","depth":26,"bounds":{"left":0.21757813,"top":0.47847223,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"session.post","depth":27,"bounds":{"left":0.11992188,"top":0.49791667,"width":0.039453126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"request for each sample one by one.","depth":26,"bounds":{"left":0.11757813,"top":0.4965278,"width":0.11640625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This means the script waits for one HTTP request to finish before starting the next,","depth":26,"bounds":{"left":0.11757813,"top":0.51458335,"width":0.11289062,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"completely defeating the purpose of using","depth":26,"bounds":{"left":0.11757813,"top":0.55069447,"width":0.11171875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp","depth":27,"bounds":{"left":0.18046875,"top":0.5701389,"width":0.023046875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for asynchronous concurrency.","depth":26,"bounds":{"left":0.11757813,"top":0.56875,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.61041665,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Collect these coroutines into a list and use","depth":28,"bounds":{"left":0.1296875,"top":0.61041665,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.gather()","depth":29,"bounds":{"left":0.16601562,"top":0.6298611,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(ideally wrapped in an","depth":28,"bounds":{"left":0.1296875,"top":0.6284722,"width":0.09257813,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.Semaphore","depth":29,"bounds":{"left":0.13203125,"top":0.66597223,"width":0.055859376,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to avoid overwhelming the endpoint) to run the replayability probes concurrently.","depth":28,"bounds":{"left":0.1296875,"top":0.6645833,"width":0.10546875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):","depth":26,"bounds":{"left":0.11757813,"top":0.72430557,"width":0.10976563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Standard synchronous file operations (","depth":26,"bounds":{"left":0.11757813,"top":0.7423611,"width":0.107421875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"with open(...)","depth":27,"bounds":{"left":0.12226562,"top":0.7798611,"width":0.045703124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.1703125,"top":0.77847224,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.dump","depth":27,"bounds":{"left":0.17617187,"top":0.7798611,"width":0.029296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.2078125,"top":0.77847224,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.load","depth":27,"bounds":{"left":0.11992188,"top":0.79791665,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") are used inside","depth":26,"bounds":{"left":0.15195313,"top":0.7965278,"width":0.048046876,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"async def","depth":27,"bounds":{"left":0.20234375,"top":0.79791665,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"functions.","depth":26,"bounds":{"left":0.11757813,"top":0.7965278,"width":0.11796875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"In Python,","depth":26,"bounds":{"left":0.14570312,"top":0.81458336,"width":0.030078124,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"this blocks the entire event loop while the disk is reading/writing.","depth":26,"bounds":{"left":0.11757813,"top":0.81458336,"width":0.11875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.87430555,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Use an asynchronous file I/O library like","depth":28,"bounds":{"left":0.1296875,"top":0.87430555,"width":0.09335937,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiofiles","depth":29,"bounds":{"left":0.1625,"top":0.89375,"width":0.0265625,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":28,"bounds":{"left":0.19140625,"top":0.8923611,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or run the blocking I/O operations in a separate thread using","depth":28,"bounds":{"left":0.1296875,"top":0.8923611,"width":0.10585938,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.to_thread()","depth":29,"bounds":{"left":0.16914062,"top":0.9298611,"width":0.0625,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
2043951250885907759
|
9020551951192732151
|
visual_change
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Show more options
Show thinking
Gemini said
Gemini said
Yes,
there are several issues and code smells in this Pull Request,
ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues,
categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
,
the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one.
This means the script waits for one HTTP request to finish before starting the next,
completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions.
In Python,
this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
,
or run the blocking I/O operations in a separate thread using
asyncio.to_thread()...
|
NULL
|
|
8547
|
165
|
14
|
2026-04-14T06:51:26.124695+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149486124_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...l1 OpenJy 19798 evaluation for ai activity types #468All commitsPlatform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval* endpoints.pyE eval-ai-activity-type.pyE .gitignore293294299are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating AsyncTuines ch-06nsioe_collect_ from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to runthe replayability probes concurrently.• Blocking File I/O in Async Functions(Lines 110-114, 184-188, 268-269):Standard synchronous file operations(with open(...), json. dump,json.load ) are used inside async def• Enter a prompt for Gemini+Pro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page329330 +‹-0ll • | Daily - Platform • nowA100% CD8 • Tue 14 Apr 9:51:26• 2 1 203 viewed• Awaltind aporoval+330 UUUUu• Viewedparser = create_parser(AiActivityTypeEvaluator)parser.aad arcument"_-data-set",lyoe-sutndetault="initial",dest="data_set",nelo= Vataser Toler uncer eval outvurral acuvity tviereval catasets, etault.initial)",parser.aad arcument"--populate-ground-truth-1lm",action="store_true",dest="populate_ground_truth_llm",parser.add araument"--ground-truth-llm-endpoint",type=str,default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],default="es",dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_Lm)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))> .gitignore & ++4 0000• ViewedYou made it to the end!...
|
NULL
|
480115190393175185
|
NULL
|
click
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...l1 OpenJy 19798 evaluation for ai activity types #468All commitsPlatform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval* endpoints.pyE eval-ai-activity-type.pyE .gitignore293294299are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating AsyncTuines ch-06nsioe_collect_ from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to runthe replayability probes concurrently.• Blocking File I/O in Async Functions(Lines 110-114, 184-188, 268-269):Standard synchronous file operations(with open(...), json. dump,json.load ) are used inside async def• Enter a prompt for Gemini+Pro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page329330 +‹-0ll • | Daily - Platform • nowA100% CD8 • Tue 14 Apr 9:51:26• 2 1 203 viewed• Awaltind aporoval+330 UUUUu• Viewedparser = create_parser(AiActivityTypeEvaluator)parser.aad arcument"_-data-set",lyoe-sutndetault="initial",dest="data_set",nelo= Vataser Toler uncer eval outvurral acuvity tviereval catasets, etault.initial)",parser.aad arcument"--populate-ground-truth-1lm",action="store_true",dest="populate_ground_truth_llm",parser.add araument"--ground-truth-llm-endpoint",type=str,default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],default="es",dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_Lm)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))> .gitignore & ++4 0000• ViewedYou made it to the end!...
|
NULL
|
|
8549
|
164
|
10
|
2026-04-14T06:51:30.388498+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149490388_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)M inbox (2) - nii22 Pluctorm SprlC 3y 19798 eva(l Service-DeskJr Foet AmesonxCost | Claude@ Actions secrC C Dependathttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/37• Sminny0 Projectslã AWsE SSHElDatadog * Claude5 CircieclA Sentry|8 TasksPlatform TeamQ Search board8800018EpicvTypevLabel vQuick filtersREADY FOR DEV1INOEV 4COOE REVIEW 3BLOCKED[Part? Automated reoortsexceed Fontawesomepackage limitsMASNTENANCISReady for Dev-205641П ****=AJ PEPORTSIn DevBE4-18400 115n****= 1Delays in CRM SyncCLATFORM STARUTTYIn DevQJy-20553workfiow for automatedDependabot fixesEMPROVEMENT OF OUR FIFFICHINGYIn Dev₴ J-2055120000Sru and sboize wodllelOS using now mobile ose...MASNTENANCSIn DevQAJy-20431Evaluation for AI ActivityTypesAUTO-OETICTED AETIVITY TYPGCode Review+ Create-J-Wies5 -•00=AJ Reports TrackingCode Review0 J-20543 | 11 •0• =for SSO for 'persistent'name,id.formatREDUCE CHURNCode ReviewД -л1-2063205 •*=+ Create9:51 AMDaily - Platform(ah)Daily - Platform - now8•• Actions sec:( Dependabot-Insights & Coachin.• DevDux.Compiete sprintPO ACCEPTANCEO Far Dependa:D Al BookmarksGrouo: QueriesDEPLOYHx oowntnte ouring teteature flag reieaseBUPPORT TICKETS0 -20585 15 + ****=AI Review - Q1|CTHEMATAOOCOeployed• -19982 1 @ 00.0=antity "Event"PLATTORM STARUITYDeployed© _20384 08 1 •*•*=Missed deletion of stale0 -20619 1 11 •..•:(Pipedrive) DI > status notlabeled, date data missingCEOUCT CKOR0 -200м 05 [1 ****=Steika StoyanovaSteliyan GeorgievKovalik6:068 • Tue 14 Apr 9:51:30ChatNikolay Yankov*Nikolay lvanov...
|
NULL
|
6827044167547302528
|
NULL
|
visual_change
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)M inbox (2) - nii22 Pluctorm SprlC 3y 19798 eva(l Service-DeskJr Foet AmesonxCost | Claude@ Actions secrC C Dependathttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/37• Sminny0 Projectslã AWsE SSHElDatadog * Claude5 CircieclA Sentry|8 TasksPlatform TeamQ Search board8800018EpicvTypevLabel vQuick filtersREADY FOR DEV1INOEV 4COOE REVIEW 3BLOCKED[Part? Automated reoortsexceed Fontawesomepackage limitsMASNTENANCISReady for Dev-205641П ****=AJ PEPORTSIn DevBE4-18400 115n****= 1Delays in CRM SyncCLATFORM STARUTTYIn DevQJy-20553workfiow for automatedDependabot fixesEMPROVEMENT OF OUR FIFFICHINGYIn Dev₴ J-2055120000Sru and sboize wodllelOS using now mobile ose...MASNTENANCSIn DevQAJy-20431Evaluation for AI ActivityTypesAUTO-OETICTED AETIVITY TYPGCode Review+ Create-J-Wies5 -•00=AJ Reports TrackingCode Review0 J-20543 | 11 •0• =for SSO for 'persistent'name,id.formatREDUCE CHURNCode ReviewД -л1-2063205 •*=+ Create9:51 AMDaily - Platform(ah)Daily - Platform - now8•• Actions sec:( Dependabot-Insights & Coachin.• DevDux.Compiete sprintPO ACCEPTANCEO Far Dependa:D Al BookmarksGrouo: QueriesDEPLOYHx oowntnte ouring teteature flag reieaseBUPPORT TICKETS0 -20585 15 + ****=AI Review - Q1|CTHEMATAOOCOeployed• -19982 1 @ 00.0=antity "Event"PLATTORM STARUITYDeployed© _20384 08 1 •*•*=Missed deletion of stale0 -20619 1 11 •..•:(Pipedrive) DI > status notlabeled, date data missingCEOUCT CKOR0 -200м 05 [1 ****=Steika StoyanovaSteliyan GeorgievKovalik6:068 • Tue 14 Apr 9:51:30ChatNikolay Yankov*Nikolay lvanov...
|
NULL
|
|
8550
|
165
|
16
|
2026-04-14T06:51:31.143027+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149491143_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...l1 OpenJy 19798 evaluation for ai activity types #468All commitsPlatform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval* endpoints.pyE eval-ai-activity-type.py#.gitignore293294299are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Asyncluines ch-06t nsiee_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to runthe replayability probes concurrently.• Blocking File I/O in Async Functions(Lines 110-114, 184-188, 268-269):Standard synchronous file operations(with open(...), json. dump,json.load ) are used inside async def• Enter a prompt for Gemini+Pro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page328329330 +‹-0ll • | Daily - Platform • nowA100% CD8 • Tue 14 Apr 9:51:31• 2 1 203 viewed•Awaltind aporova+330 UUUUu• Viewedparser = create_parser(AiActivityTypeEvaluator)parser.aad arcument"_-data-set",lyoe-sutndetault="initial",dest="data_set",nelo= Vataser Toler uncer eval outvurral acuvity tviereval catasets, etault.initial)",parser.aad arcument"--populate-ground-truth-1lm",action="store_true",dest="populate_ground_truth_llm",parser.add araument"--ground-truth-llm-endpoint",type=str,default="OPENAI: gpt-5_2",dest="qround truth llm endpoint".parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],default="es",dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_Lm)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))> .gitignore & ++4 0000• ViewedYou made it to the end!...
|
NULL
|
-3232519624691475072
|
NULL
|
visual_change
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...l1 OpenJy 19798 evaluation for ai activity types #468All commitsPlatform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval* endpoints.pyE eval-ai-activity-type.py#.gitignore293294299are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Asyncluines ch-06t nsiee_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to runthe replayability probes concurrently.• Blocking File I/O in Async Functions(Lines 110-114, 184-188, 268-269):Standard synchronous file operations(with open(...), json. dump,json.load ) are used inside async def• Enter a prompt for Gemini+Pro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page328329330 +‹-0ll • | Daily - Platform • nowA100% CD8 • Tue 14 Apr 9:51:31• 2 1 203 viewed•Awaltind aporova+330 UUUUu• Viewedparser = create_parser(AiActivityTypeEvaluator)parser.aad arcument"_-data-set",lyoe-sutndetault="initial",dest="data_set",nelo= Vataser Toler uncer eval outvurral acuvity tviereval catasets, etault.initial)",parser.aad arcument"--populate-ground-truth-1lm",action="store_true",dest="populate_ground_truth_llm",parser.add araument"--ground-truth-llm-endpoint",type=str,default="OPENAI: gpt-5_2",dest="qround truth llm endpoint".parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],default="es",dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_Lm)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))> .gitignore & ++4 0000• ViewedYou made it to the end!...
|
NULL
|
|
8552
|
164
|
12
|
2026-04-14T06:51:35.430819+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149495430_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:51
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:51","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.020833334,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.044444446,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.07986111,"top":0.90444446,"width":0.08125,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.07986111,"top":0.93777776,"width":0.08125,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
6522371285992393864
|
6113841948802480952
|
click
|
accessibility
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:51
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
NULL
|
|
8554
|
164
|
13
|
2026-04-14T06:51:36.069820+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149496069_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFileEditViewTabsBookmarksHistoryExtensionsWindo DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(ablDaily - Platform • now100% <78• Tue 14 Apr 9:51:35meet.google.com / Daily - PlatformNikolay Yankov (Presenting)Chat8•nbox (2) - nilPlatform TeamREADY FOR DEV 1oohesnetixWinexceed Fontawesomepackage limitsMASNTENANCSReady for Dev8-20564122 Pluctorm SprlC 3y 19798 eva(l Service-Desk xIr Foet AwesonCost | Claude) Actions secrC C Dependat• Actions sec:https://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-205430 Projects# AWS|E SSHEl Datadog* Claude3 CircieciSentryD TasksInsights & Coachin.800$ Jy-19240 / A JY-20543AJ Reports > TrackingDescriptionWe want to be able to track the usage of the AJ reports. We will use this to keep track of the adoption but also to useUserpilot tooltips to push users who are not using it to use it.• track each generated reports in DD - Include company name and frequency• for AJ reports - track each generated report in UserPilot as an event on the user - track it only for the user who hascreated the report• for Exec reports - track each generated report - set the tracking for each user in the non-jiminny participants listnote: for UP you can see how we currently track events such as Logged-activity, Held-conferenceSubtasksnousuueskLinked work itemsAos unkee wotk tenActivityAod a commentt.Suggest a reply…Status update…..CooeKowwyDetails@ Lukas KovallkReporter2 Galya DimitrovaUeyele enene@ Open with VS Code|1 branch4 commits1 pull request2 buildsComponentsPlatformSuoerooueAdd optionsNoneStory point estimate( Dependabot-• DevDux.O Far DependolD Al BookmarksXx Improve Story5 days agcUAEh9:51 AMDaily - PlatformGroup: Queriesleature fiag releaseDeployed0 -205e5 15 + ****=AI Review - Q1CNIHEVAORCEVIDeployed_19982 104000-No such column "Type' onentty 'Event"PLATTORM STABTLITYDeployed2038405 0800=Missed deletion of stale0 m40n78080(Pipedrive) DI > status notlabeled, date data missingDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay IvanovLukas Kovalik6:11...
|
NULL
|
-6552454091515785628
|
NULL
|
click
|
ocr
|
NULL
|
DiaFileEditViewTabsBookmarksHistoryExtensionsWindo DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(ablDaily - Platform • now100% <78• Tue 14 Apr 9:51:35meet.google.com / Daily - PlatformNikolay Yankov (Presenting)Chat8•nbox (2) - nilPlatform TeamREADY FOR DEV 1oohesnetixWinexceed Fontawesomepackage limitsMASNTENANCSReady for Dev8-20564122 Pluctorm SprlC 3y 19798 eva(l Service-Desk xIr Foet AwesonCost | Claude) Actions secrC C Dependat• Actions sec:https://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-205430 Projects# AWS|E SSHEl Datadog* Claude3 CircieciSentryD TasksInsights & Coachin.800$ Jy-19240 / A JY-20543AJ Reports > TrackingDescriptionWe want to be able to track the usage of the AJ reports. We will use this to keep track of the adoption but also to useUserpilot tooltips to push users who are not using it to use it.• track each generated reports in DD - Include company name and frequency• for AJ reports - track each generated report in UserPilot as an event on the user - track it only for the user who hascreated the report• for Exec reports - track each generated report - set the tracking for each user in the non-jiminny participants listnote: for UP you can see how we currently track events such as Logged-activity, Held-conferenceSubtasksnousuueskLinked work itemsAos unkee wotk tenActivityAod a commentt.Suggest a reply…Status update…..CooeKowwyDetails@ Lukas KovallkReporter2 Galya DimitrovaUeyele enene@ Open with VS Code|1 branch4 commits1 pull request2 buildsComponentsPlatformSuoerooueAdd optionsNoneStory point estimate( Dependabot-• DevDux.O Far DependolD Al BookmarksXx Improve Story5 days agcUAEh9:51 AMDaily - PlatformGroup: Queriesleature fiag releaseDeployed0 -205e5 15 + ****=AI Review - Q1CNIHEVAORCEVIDeployed_19982 104000-No such column "Type' onentty 'Event"PLATTORM STABTLITYDeployed2038405 0800=Missed deletion of stale0 m40n78080(Pipedrive) DI > status notlabeled, date data missingDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay IvanovLukas Kovalik6:11...
|
NULL
|
|
8555
|
165
|
18
|
2026-04-14T06:51:36.173860+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149496173_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 eval Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...soccce Cdcsediie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringolcses.are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio.Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vl1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval= enapolnts.pyE eval-ai-activity-type.pyE .gitignoreYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page>.gitignore ?293299+329330< 40 lhl | Daily - Platform • now100% [42' 8• Tue 14 Apr 9:51:36• 2 1 203 viewed• Awallind aporovall+330 UUUUU |Viewednarser = create narser AiActivitvTvoervaluator)parser.add_argument("--data-set",lyoe-sutndefault="initial",dest="data_set",helo="Dataset folder under eval outout/ai activity tvoe/eval datasets/ (default:initial)",parser.add_argument("--populate-ground-truth-Ulm",action="store_true",dest="populate_ground_truth_llm",parser.add_argument("_-ground-truth-llm-endpoint",nvoessur.default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],derault=es,dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUNd_TRUTH_LLM = bool(args.populate_ground_truth_(m)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))+4 0000• Viewed••You made it to the end!...
|
NULL
|
2855163612701406840
|
NULL
|
click
|
ocr
|
NULL
|
Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 eval Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...soccce Cdcsediie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringolcses.are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio.Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vl1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval= enapolnts.pyE eval-ai-activity-type.pyE .gitignoreYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page>.gitignore ?293299+329330< 40 lhl | Daily - Platform • now100% [42' 8• Tue 14 Apr 9:51:36• 2 1 203 viewed• Awallind aporovall+330 UUUUU |Viewednarser = create narser AiActivitvTvoervaluator)parser.add_argument("--data-set",lyoe-sutndefault="initial",dest="data_set",helo="Dataset folder under eval outout/ai activity tvoe/eval datasets/ (default:initial)",parser.add_argument("--populate-ground-truth-Ulm",action="store_true",dest="populate_ground_truth_llm",parser.add_argument("_-ground-truth-llm-endpoint",nvoessur.default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],derault=es,dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUNd_TRUTH_LLM = bool(args.populate_ground_truth_(m)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))+4 0000• Viewed••You made it to the end!...
|
NULL
|
|
8558
|
165
|
20
|
2026-04-14T06:52:07.796922+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149527796_m2.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.068750024},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.06944442},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.7296875,"top":1.0,"width":0.023046875,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.7453125,"top":1.0,"width":0.002734375,"height":-0.068750024},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.7558594,"top":1.0,"width":0.0140625,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.75742185,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.7746094,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.7734375,"top":1.0,"width":0.01328125,"height":-0.0625},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:52","depth":10,"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is on.","depth":4,"bounds":{"left":0.23554687,"top":1.0,"width":0.000390625,"height":-0.050694466},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.2957031,"top":1.0,"width":0.12148438,"height":-0.02569449},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
-4590837745933313988
|
-9169124920219936240
|
idle
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform
Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...soccce Cdcsediie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vl1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval= enapolnts.pyE eval-ai-activity-type.pyE .gitignoreYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page>.gitignore ?293299+329330< 40 lhl | Daily - Platform • now100% [42' 8• Tue 14 Apr 9:52:07• 2 / 203 viewed•Awallind aporovall+330 UUUUU |Viewednarser = create narser AiActivitvTvoervaluator)parser.add_argument("--data-set",lyoe-sulndefault="initial,dest="data_set",helo="Dataset folder under eval outout/ai activity tvoe/eval datasets/ (default:initial)",parser.add_argument("--populate-ground-truth-Ulm",action="store_true",dest="populate_ground_truth_llm",parser.add_argument("_-ground-truth-llm-endpoint",nvoessur.default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],derault=es,dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_(m)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))+4 0000• Viewed••You made it to the end!...
|
NULL
|
|
8559
|
164
|
15
|
2026-04-14T06:52:10.141659+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149530141_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:52","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022222223,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.017361112,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08125,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08125,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is on.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
-4590837745933313988
|
-9169124920219936240
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(ablDaily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)100% <47•8• Tue 14 Apr 9:52:09Chat68•nbox (2) - nil22 Pluctorm SprlC 3y 19798 eva(l Service-Desk xIr Foet AwesonCost | Claude) Actions secrC C Dependat• Actions sec:( Dependabot-https://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-205430 ProjectsB AWSE3 SSHEl Datadog* Claude3 CircieciSentryD TasksInsights & Coachin.• DevPlatform Team800$ Jy-19240 / A JY-20543READY FOR DEV 1AJ Reports > TrackingoohesnetixWinexceed Fontawesomepackage limitsMASNTENANCSReady for Dev8-205641DescriptionWe want to be able to track the usage of the AJ reports. We will use this to keep track of the adoption but also to useUserpilot tooltips to push users who are not using it to use it.• track each generated reports in DD - Include company name and frequency• for AJ reports - track each generated report in UserPilot as an event on the user - track it only for the user who hascreated the report• for Exec reports - track each generated report - set the tracking for each user in the non-jiminny participants listnote: for UP you can see how we currently track events such as Logged-activity, Held-conferenceSubtasksnousuueskLinked work itemsAos unkee wotk tenActivityAod a commentt.Suggest a reply…Status update...Dux.CooeKowwyXx Improve StoryDetails@ Lukas KovallkReporter2 Galya DimitrovaUeyele enene@ Open with VS Code|1 branch4 commits1 pull request2 builds5 days agcUAEhComponentsPlatformSuoerooueAdd optionsNoneStory point estimateO Far DependolD Al BookmarksGroup: Queriesleature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNIHEVAORCEVIDeployed_19982No such column "Type' onentty 'Event"PLATTORM STARTLITYDeployed2038405 0000=Missed deletion of stale0 m400(Pipedrive) DI > status notlabeled, date data missingDeployed0 -20614 05 П ****=Stefka StoyanovaNikolay YankovSteliyan Georgiev,Nikolay Ivanovfalik9:52 AM Daily - Platform:6:45...
|
NULL
|
|
8562
|
164
|
17
|
2026-04-14T06:52:19.090750+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149539090_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:52","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022222223,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.017361112,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08125,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08125,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is on.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
-4590837745933313988
|
-9169124920219936240
|
visual_change
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(ablDaily - Platform • now100% [8• Tue 14 Apr 9:52:18meet.google.com / Daily - PlatformNikolay Yankov (Presenting)Chat8•nbox (2) - nilPlatform TeamQ Search boardREADY FOR DEV 1oohesnetixWinexceed Fontawesomepackage limitsMASNTENANCSReady for Dev@-205641 П ****:22 Pluctorm SprlC 3y 19798 eva(l Service-Desk xIr Foet AwesonCost | Claude) Actions secrC C Dependathttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-205430 Projects# AWS|E SSHE Datados0 • * Ciaudel3 CircieciSentry8 Tasks800$ Jy-19240 / A JY-20543AJ Reports > TrackingDescriptionWe want to be able to track the usage of the AJ reports. We will use this to keep track of the adoption but also to useUserpilos tooiops to pusn users wno are hot usino i to use ic• track each generated reports in DD - Include company name and frequency• for AJ reports - track each generated report in UserPilot as an event on the user - track it only for the user who hascreated the report• for Exec reports - track each generated report - set the tracking for each user in the non-jiminny participants listnote: for UP you can see how we currently track events such as Logged-activity, Held-conferenceSubtasksnousuoes%Linked work itemsAos unkee wotk tenActivityAoda commenttSuggest a reply…Status update...Uro bpc press M loo• Actions sec:Insights & Coachin.( Dependabot-• DevDux.O Far DependolD Al BookmarksCooeKowewyDetails@ Lukas KovallkReporter2 Galya DimitrovaDevelopment #@ Open with VS Code|1 branch4 commits1 pull request2 buildsComponentsPlatformSuoerooueAdd optionsNoneStory point estimateXx Improve Story5 days agcOMEN C +Group: Queriesleature fiag releaseDeployed0 -205e5 15 + ****=•AI Review - Q1CTNTHEVANAORCEVDeployed19982wo sucn coumh "lype onentty 'Event"PLATTORM STARTLITYDeployed2038405 0000=Missed deletion of stale0 m400(Pipedrive) DI > status notlabeled, date data missingDeployed0 -20614 05 П ****=Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay IvanovValik9:52 AM Daily - Platform:6:54...
|
NULL
|
|
8564
|
165
|
22
|
2026-04-14T06:52:40.961202+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149560961_m2.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.068750024},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.06944442},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.7296875,"top":1.0,"width":0.023046875,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.7453125,"top":1.0,"width":0.002734375,"height":-0.068750024},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.7558594,"top":1.0,"width":0.0140625,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.75742185,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.7746094,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.7734375,"top":1.0,"width":0.01328125,"height":-0.0625},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:52","depth":10,"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is on.","depth":4,"bounds":{"left":0.23554687,"top":1.0,"width":0.000390625,"height":-0.050694466},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.2957031,"top":1.0,"width":0.12148438,"height":-0.02569449},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
-4590837745933313988
|
-9169124920219936240
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform
Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...soccce Cdcsediie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vl1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval= enapolnts.pyE eval-ai-activity-type.pyE .gitignoreYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page>.gitignore ?293299+329330< 40 lhl | Daily - Platform • now100% (' • Tue 14 Apr 9:52:40• 2 / 203 viewed•Awallind aporovali+330 UUUUU |Viewednarser = create narser AiActivitvTvoervaluator)parser.add_argument("--data-set",lyoe-sulndefault="initial,dest="data_set",helo="Dataset folder under eval outout/ai activity tvoe/eval datasets/ (default:initial)",parser.add_argument("--populate-ground-truth-Ulm",action="store_true",dest="populate_ground_truth_llm",parser.add_argument("_-ground-truth-llm-endpoint",nvoessur.default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],derault=es,dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_(m)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))+4 0000• Viewed••You made it to the end!...
|
NULL
|
|
8565
|
164
|
19
|
2026-04-14T06:52:48.026924+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149568026_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:52","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022222223,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.017361112,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08125,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08125,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
2540129927862717858
|
-9164621329144752104
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(ablDaily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)100% <478• Tue 14 Apr 9:52:47Chat68•Platform TeamREADY FOR DEV 1doesnetixWiwexceed Fontawesomepackage limitsMAENTENANCISReady for Dev8-205641 П ****=G 3y 19798 eva(l Service-Desk xJk Foet Aweson• Cost | Claude) Actions secrC C Dependat•) Actions secr( Dependabot: xO Far Dependolsny.atlassian.net/jira/software/c/projects/JY/boards/37?selectedissue=JY-205430 ProjectsBl AWSE3 SSHE Datados0 C * Claudel3 Circieci[ Sentry8 Tasks• Del800$ Jy-19240 / A JY-20543AJ Reports > Tracking~ DescriptionWe want to be able to track the usage of the AJ reports. We will use this to keep track of the adoption but also to useUserpilos tooiops to pusn users wno are not usino i to use ic• track each generated reports in DD - Include company name and frequency• for AJ reports - track each generated report in UserPilot as an event on the user - track it only for the user who hascreated the report• for Exec reports - track each generated report - set the tracking for each user in the non-jiminny participants list |note: for UP you can see how we currenty track events such as Logged-activity, Held-conferenceSubtasksnousuueskLinked work itemsAod unkee work erActivityAoda commenttSuggest a reply.Status update...Uro bpc press M looInsights & Coachin.D ux.@21Cooe KowwwyXZ Improve Story~ Details@ Lukas KovallkReporter2 Galya DimitrovaUerele enene@ Open with VS Code|1 branch4 commits1 pull request2 builds5 days agcUrEhComponentsPlatformSuoerooueAdd optionsNoneStory point estimateD Al Bookmarks9:52 AM Daily - PlatformGroup: QueriesDEPLOY &Puooanonoeleature fiag reieaseDeployed0 -20505 15 + ****=•AI Review - Q1CTNIHEWANAORCOVDeployed_19982o e00=wo sucn co umh "lype onentty 'Event"PLATTORM STARTUITYDeployed2038405 0600=Missed deletion of stale0 -400R408009(Pipedrive) DI > status notlabeled, date data missingDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay Ivanovvalik•7:23...
|
NULL
|
|
8568
|
165
|
24
|
2026-04-14T06:52:57.830205+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149577830_m2.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.068750024},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.06944442},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.7296875,"top":1.0,"width":0.023046875,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.7453125,"top":1.0,"width":0.002734375,"height":-0.068750024},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.7558594,"top":1.0,"width":0.0140625,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.75742185,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.7746094,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.7734375,"top":1.0,"width":0.01328125,"height":-0.0625},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:52","depth":10,"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.2957031,"top":1.0,"width":0.12148438,"height":-0.02569449},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
420223797619469401
|
-9169124928809870832
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:52
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
meet.google.com / Daily - Platform
Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...soccce Cdcsediie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vl1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval= enapolnts.pyE eval-ai-activity-type.pyE .gitignoreYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page>.gitignore ?293299+329330< 40 lhl | Daily - Platform • now100% [42' 8• Tue 14 Apr 9:52:57• 2 / 203 viewed•Awallind aporovall+330 UUUUU |Viewednarser = create narser AiActivitvTvoervaluator)parser.add_argument("--data-set",lyoe-sulndefault="initial,dest="data_set",helo="Dataset folder under eval outout/ai activity tvoe/eval datasets/ (default:initial)",parser.add_argument("--populate-ground-truth-Ulm",action="store_true",dest="populate_ground_truth_llm",parser.add_argument("_-ground-truth-llm-endpoint",nvoessur.default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],derault=es,dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_(m)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))+4 0000• Viewed••You made it to the end!...
|
NULL
|
|
8569
|
164
|
21
|
2026-04-14T06:53:01.581073+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149581581_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:53","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022222223,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08125,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08125,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
6850664184897375859
|
-9164621329144752104
|
visual_change
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(abl| Daily - Platform • nowmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)100% [8• Tue 14 Apr 9:53:01Chat68108•Inbox (2) - n22 Plactorm SprlG Jy 19798 eval(l Service-Desk xJr Foet Aweson• Cost | Claud x@ Actions secreC C: Dependat• Actions secr( Dependabot: xhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/372selectedissue=JY-20632• Sminn• ProjectsE SSHPlatform TeamQ Search board2800018READY FOR DEV 1INDEV 4Part? Automated reoorts)exceed FontawesomeMASNTENANESReady for Dev8-205641AJREPORTSIn DevBEFEA -18900 115n ****= 3Delays in CRM SyncCUATFORM STABTUITNIn DevQJy-20553workfiow for automatedDependabot fixesEMPROVEMENT OF OUR EFFICEINCYIn Dev©J-205512 00005ru and swaoiize wodieOS using new mobile use...MAINTENANCEIn DevQAД -37-20431El Dacadog © * Claude2 CircleclSentry0 Tasks)Insights & Coachin.• Dev• UXLabel vQuick filtersComplete sprintCOOE REVIEW 3BLOCKEDPO ACCEPTANCEEvaluation for AI ActivityTypesAUTO-DETECTED ACTIVITY TYPECode ReviewJ1-17085 -•00=AJ Reports > TrackingCode ReviewД 3-20543 1for SSO for 'persistent'name_id_format (REDUCE CHURNCode ReviewД лл-206320,5 •.• =+ CreateO Far Dependo:D Al BookmarksGroup: QueriesDEPLOY &nix cownome ouring laleature fiag releaseBUPPORT TICKETSDeployed2036515 •000=AI Review - Q1Oeployed0 -19982|1 • •=*=no sucn coumn "lype onentity "Event"PLATTORM STABTUTTYDeployed-20384 05 П ••*•=|Missed deletion of staleREDUCE CHURN0 1200n(Pipedrive) DI > status not|labeled, date data missingREOUCE CHURNDeployed0 -20614 05 П ****=Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay IvanovKovalik9:53 AM Daily - Platform• 7:37...
|
NULL
|
|
8572
|
164
|
23
|
2026-04-14T06:53:37.920446+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149617920_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:53","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022222223,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08125,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08125,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
4595186536147244130
|
6113912317546658616
|
visual_change
|
accessibility
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
NULL
|
|
8574
|
164
|
25
|
2026-04-14T06:53:50.015198+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149630015_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:53","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022222223,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08125,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08125,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
4595186536147244130
|
6113912317546658616
|
visual_change
|
accessibility
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
NULL
|
|
8575
|
165
|
26
|
2026-04-14T06:53:59.033351+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149639033_m2.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.068750024},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.06944442},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.7296875,"top":1.0,"width":0.023046875,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.7453125,"top":1.0,"width":0.002734375,"height":-0.068750024},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.7558594,"top":1.0,"width":0.0140625,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.75742185,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.7746094,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.7734375,"top":1.0,"width":0.01328125,"height":-0.0625},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"role_description":"text"},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"role_description":"text"},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:53","depth":10,"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.23554687,"top":1.0,"width":0.000390625,"height":-0.050694466},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.2957031,"top":1.0,"width":0.12148438,"height":-0.02569449},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
4595186536147244130
|
6113912317546658616
|
idle
|
accessibility
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:53
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
NULL
|
|
8577
|
164
|
27
|
2026-04-14T06:54:23.348778+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149663348_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
Mute Stefka Stoyanova's microphone
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Stefka Stoyanova's microphone","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:54","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022916667,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.047222223,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08194444,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08194444,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
5783925886901743235
|
-4552651086961583336
|
visual_change
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
Mute Stefka Stoyanova's microphone
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)E:1nbox (2)SRD-677C Jy 19798(7 Service-oir Font. Amr• COMTICHhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/372selectedissue=JY-20632Simir0 ProjectsB AWSE3 SSHE Datadog3 CircieciPlatform Team800$ 5y-12250 / [ Jy-20632READY FOR DEV 1Prepare fallback with email for SSO for 'persistent name_id_formathoohesnetixWiwexceed Fontawesomepackage limitsMAINTENANCISReady for Dev©-20564 1 П ****=DescriptionEdit descriptionSubtasksAdd subtaskLinked work items0 SRD-6779 Unable to log in to SidelickIN PROGRESS MActivityWOrK lOAdd a commentCan I get more info.?Status update...Fro bpc press M lo comment)9:54 AM Daily - PlatformActions s(abl| Daily - Platform • now100% <478• Tue 14 Apr 9:54:23Chat8•С Ct: DepeActions sDependol• Far DepeD TasksInsights & Coachin• DevD ux.Code Keviw y# Improve Story~ DetailsNikolay NikolovAssign to meKeoohNikolay NikolovWweoenieQ Open with VS Code}J Create branch12 commitsl1 pull requestouikeComponentsPlatformSub-ProductSidekickLabelsNonelStory point estimateNoneOPEN CGroup: Queriesteature fiag reieaseSUPPORT TICKETSDeployed1 2006515 * •000=AI Review - Q1CNIHEWANAORCEVDeployed_190821o de0=No such column "Type' onentity 'Event"PLATTORM STARTLTTYDeployed8 -20384 05 11 ****=|Missed deletion of stale0 -40003400000(Pipedrive) DI > status notlabeled, date data missingREOUCT CHURNNDeployed0 -20614 05 П ****=:D Al BookmarksStefka StoyanovalNikolay Yankovan GeorgievNikolay IvanovkLukas Kovalik8:58...
|
NULL
|
|
8580
|
164
|
29
|
2026-04-14T06:54:34.054266+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149674054_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
Mute Stefka Stoyanova's microphone
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Stefka Stoyanova's microphone","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:54","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022916667,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.047222223,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08194444,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08194444,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is on.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
1901171151674562519
|
-3399940680001529064
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
Mute Stefka Stoyanova's microphone
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)E:1nbox (2)SRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/372selectedissue=JY-20632Simir0 ProjectsB AWSE3 SSHE Datadog* Claude3 CircieciPlatform Team800$ 5y-12250 / [ Jy-20632READY FOR DEV 1Prepare fallback with email for SSO for 'persistent name_id_formatoohesnetxWiwexceed Fontawesomepackage limitsMAINTENANCISReady for Dev©-20564 1 П ****=DescriptionEdit descriptionSubtasksAdd subtaskLinked work items0 SRD-6779 Unable to log in to SidekickIN PROGRESS MActivityWOrK lOAod a comment.Can I get more info.?Status update...Fro bpc press M lo comment)9:54 AM Daily - PlatformActions s(ablDaily - Platform • now100% <478• Tue 14 Apr 9:54:336ChatС Ct: DepeActions sDependolD TasksInsights & Coachin.• DelD ux.IZ Improve Story~ DetailsNikolay NikolovAsuntoeReporter• Nikolay NikolovUerelo enecne@ Open with VS Code|) Create branch12 commits1 pull requestouikkComponentsPlatformSuoerooueSidekickLabelsNoneStory point estimate5 days agcOPEN C8•• Far DepeD Al BookmarksGroup: QueriesSUPPORT TICKETSDeployed0E -20505 15 + ****=AI Review - Q1CTNIEUANAORCEVDeployed_19082 1o de0=No such column "Type' onentty 'Event"PLATTORM STARTLITYDeployed© -20384 05 11 ****=Missed deletion of stale0 -40001a8ese(Pipedrive) DI > status notlabeled, date data missingREOUCT CHURNNDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay Ivanov9:09...
|
NULL
|
|
8581
|
165
|
28
|
2026-04-14T06:54:34.041166+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149674041_m2.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
Mute Stefka Stoyanova's microphone
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.068750024},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.06944442},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.7296875,"top":1.0,"width":0.023046875,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.7453125,"top":1.0,"width":0.002734375,"height":-0.068750024},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.7558594,"top":1.0,"width":0.0140625,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.75742185,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.7746094,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.7734375,"top":1.0,"width":0.01328125,"height":-0.0625},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Stefka Stoyanova's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:54","depth":10,"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off microphone","depth":14,"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is on.","depth":4,"bounds":{"left":0.23554687,"top":1.0,"width":0.000390625,"height":-0.050694466},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.2957031,"top":1.0,"width":0.12148438,"height":-0.02569449},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
1901171151674562519
|
-3399940680001529064
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
Mute Stefka Stoyanova's microphone
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn off microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is on.
meet.google.com / Daily - Platform
Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activ+ New labHistoryBookmarksProfilesToolsWindow Helpgithub.com/jiminny/prophet/pull/468/changes#diff-d2f9fea9bc428b119ae26a66453e8eab7dd0782d4a4d93075ca5ce05903d77e8~ Google Gemini...soccce Cdcsediie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vl1 OpenJy 19798 evaluation for ai activity types #468All commitsnikolaybiaivanov wants to merge 9 commits into master from JY-19798-evaluation-for-ai-activity-types [v src/eval/eval-ai-activity-type.py SQ Filter files...• E eval_output/ai_activity_type/eval…..> • ground_truth>I inoutE src/eval= enapolnts.pyE eval-ai-activity-type.pyE .gitignoreYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page>.gitignore ?293299+329330< 40 lhl | Daily - Platform • now100% [42' 8• Tue 14 Apr 9:54:33• 2 1 203 viewed•Awallind aporovall+330 UUUUU |Viewednarser = create narser AiActivitvTvoervaluator)parser.add_argument("--data-set",lyoe-sulndefault="initial,dest="data_set",helo="Dataset folder under eval outout/ai activity tvoe/eval datasets/ (default:initial)",parser.add_argument("--populate-ground-truth-Ulm",action="store_true",dest="populate_ground_truth_llm",parser.add_argument("_-ground-truth-llm-endpoint",nvoessur.default="OPENAI: gpt-5_2",dest="ground_truth_llm_endpoint",parser.add_argument("--compare-to-ground-truth",type=str,choices=["es", "1lm"],derault=es,dest="compare_to_ground_truth",aros = parser.parse arasAiActivityTypeEvaluator.DATA_SET = args.data_setA1Activitylypeevaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_(m)AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpointAiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truthasyncio.run(run_evaluation(AiActivityTypeEvaluator, args))+4 0000• Viewed••You made it to the end!...
|
NULL
|
|
8583
|
164
|
31
|
2026-04-14T06:54:38.407318+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149678407_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:54","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022916667,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.047222223,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08194444,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08194444,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
2366998612440519548
|
-4552934760961549800
|
visual_change
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
You can't unmute someone else
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelp(abl| Daily - Platform • now100% <478• Tue 14 Apr 9:54:38meet.google.com / Daily - PlatformNikolay Yankov (Presenting)Chat8•Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oie Font. Anc• COMTICH• Actions :С Ct: DepeQ Actions :Dependolhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/37• Sminn• ProjectiE SSHPlatform TeamQ Search board2800018READY FOR DEV 1IN DEV 4oeeretxWirexceed FontawesomeMASNTENANCSReady for Dev8-205641'*aa*Part? Automated reoorts)with Ask JiminnyAJREPORTSIn DevBE4-16400 115n****= 3Delays in CRM Sync|CUATFORM STABTUITYIn DevQJy-20553workfiow for automatedDependabot fixesEMPROVEMENT OF OUR EFFICHINCYIn Dev© 3-20551ru and swoiize wodieOS using new mobile use...MAINTENANCEIn DevOAД -37-20431El Dacadog * Claude3 CircieciA Sentryl0 TasksInsights & Coachin.• DevD ux.Label vQuick filtersComplete sprintCOOE REVIEW 3BLOCKEDPO ACCEPTANCEEvaluation for AI ActivityTypesAUTO-DETECTED ACTIVITY TYPECode Review9л-19798 15 + **•=AJ Reports > TrackingCode ReviewД J-20543 / 11 •*• =for SSO for 'persistent'name_id_formatREDUCE CHURNCode ReviewД 31-206320,5 ..• =• Far DepeD Al BookmarksGroup: QueriesDEPLOY &nx oownome during teleature fiag releaseBUPPORT TICKETSDeployed0 -20565 15 ****=AI Review - Q1Oeployed• -19982 1 @ •000=|no sucn column "lype onentity "Event"PLATFORM STARTUITYDeployed9 _20384 05 П •**•=|Missed deletion of staleREDUCE CHURN0 20619 1 11 ****=(Pipedrive) DI > status notStefka StoyanovaNikolay Yankovyan GeorgievNikolay lvanovLovalik9:54 AM Daily - Platform9:13...
|
NULL
|
|
8585
|
164
|
33
|
2026-04-14T06:54:47.505087+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149687505_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
Mute Steliyan Georgiev's microphone
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"bounds":{"left":0.3263889,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"bounds":{"left":0.35416666,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.38472223,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.7798611,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"bounds":{"left":0.81041664,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"bounds":{"left":0.90902776,"top":0.25555557,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"bounds":{"left":0.93958336,"top":0.25777778,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"bounds":{"left":0.75208336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Steliyan Georgiev's microphone","depth":15,"bounds":{"left":0.7798611,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"bounds":{"left":0.81041664,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"bounds":{"left":0.88125,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"bounds":{"left":0.90902776,"top":0.50777775,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"bounds":{"left":0.93958336,"top":0.51,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"bounds":{"left":0.81527776,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"bounds":{"left":0.84583336,"top":0.76,"width":0.030555556,"height":0.04888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"bounds":{"left":0.8763889,"top":0.76222223,"width":0.027777778,"height":0.044444446},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:54","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022916667,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.047222223,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08194444,"top":0.90444446,"width":0.08194444,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08194444,"top":0.93777776,"width":0.08194444,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
-8315142481523474905
|
-4552651091256616168
|
visual_change
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
Mute Steliyan Georgiev's microphone
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:54
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform
DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)PlactormSRD-677X(7 (SRD-67:C Jy 19798(7 Service-oir Font. Amrhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsE3 SSHPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1SubtasksooeeneTxwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-205641 П ****:%JY-20570(FE) Prepare HTML Template for PDFreport9JY-20571(AI) Create PDF from Panorama9 JY-20572(BE) Send emall for generated report(check design)9J1-20573(BE) Manage recipients for emallsending% JY-20574 [AI) Ensure PDF formatting is good% J7-20575work9JY-20576(FE) Add new generated report in theAl reports page(BE) Add flag in AI Reports list for9 JY-20577delete rights% JY-20578 [FE) Add delete button|9JY-20579(BE) Add new report type in filtersoptions% JY-20580 (FE) Rename column Shared% JY-20583 (BE) Add period to cron job9Jy-20584(BE) Change search report in cron jobEl DatadogPriority= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= MediumCostI ClActionsNikolay Y…Steliyan .@ Lukas Ko...* Lukas Ko...Steliyan…..3 Stellyan ...Nikolay Y..Lukas Ko...Nikolay Y..Nikolay Y…Steliyan...Lukas Ko...LUxaS KO..••-+77% DoneDONE MDONEIN DEVDONE MDONE VIDONEDONEIN DEV VDONE MDONE y9:54 AM Daily - PlatformDaily - Platform • now100% <478• Tue 14 Apr 9:54:47Chat8•CE: DeperDependol• Far DepeD TasksInsights & Coachin• DelD ux.© 2Xz Improve Story~ DetailsAssignee 332 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 bulld fallingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcUrEhD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 + ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 |1 0 •*=•=No such column "Type' onentty 'Event"PLATTORM STABTLITYDeployed9 -20384 05 ^ ****=|Missed deletion of stale# *ee(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay Yankovan GeorgievNikolay Ivanovvalik• 9:22...
|
NULL
|
|
8588
|
NULL
|
0
|
2026-04-14T06:54:54.408652+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149694408_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677(7 (SRD-67:C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Jminmy0 ProjectsE3 SSHPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1SubtasksooeeneTxwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-205641 П ****:%JY-20570(FE) Prepare HTML Template for PDFreport9JY-20571(AI) Create PDF from Panorama9 JY-20572(BE) Send emall for generated report(check design)9J1-20573(BE) Manage recipients for emailsending% JY-20574 [AI) Ensure PDF formatting is good% J7-20575work9JY-20576(FE) Add new generated report in theAl reports page(BE) Add flag in AI Reports list for9 JY-20577delete rights% JY-20578 [FE) Add delete button|9JY-20579(BE) Add new report type in filtersoptions% JY-20580 (FE) Rename column SharedEl DatadogPriority= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium% JY-20583 (BE) Add period to cron job= Medium9Jy-20584(BE) Change search report in cron job= Mecium• Nikolay Y...i Stellyan .@ Lukas Ko...* Lukas Ko...Steliyan…..3 Stellyan ...Lukas Ko...Nikolay Y..Nikolay Y…Steliyan...Lukas Ko...LUxaS KO..Actions••+77% DoneDONE MDONEIN DEVDONE MDONE VIDONEDONEIN DEV VDONE MDONE y9:54 AM Daily - Platform• 0.(ah)Daily - Platform - now8•CE: DeperDependol• Far DepeD TasksInsights & Coachin• DelD ux.© 2Xz Improve Story~ DetailsAssignee &32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 bulld fallingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcUrEhD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 + ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1ot000-No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed8 -20384 05 11 ****=1Missed deletion of stale0 -400R3 409009(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan Georgievvalik9:298 • Tue 14 Apr 9:54:54ChatNikolay YankovNikolay Ivanov...
|
NULL
|
-6696527879926375764
|
NULL
|
visual_change
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677(7 (SRD-67:C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Jminmy0 ProjectsE3 SSHPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1SubtasksooeeneTxwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-205641 П ****:%JY-20570(FE) Prepare HTML Template for PDFreport9JY-20571(AI) Create PDF from Panorama9 JY-20572(BE) Send emall for generated report(check design)9J1-20573(BE) Manage recipients for emailsending% JY-20574 [AI) Ensure PDF formatting is good% J7-20575work9JY-20576(FE) Add new generated report in theAl reports page(BE) Add flag in AI Reports list for9 JY-20577delete rights% JY-20578 [FE) Add delete button|9JY-20579(BE) Add new report type in filtersoptions% JY-20580 (FE) Rename column SharedEl DatadogPriority= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium= Medium% JY-20583 (BE) Add period to cron job= Medium9Jy-20584(BE) Change search report in cron job= Mecium• Nikolay Y...i Stellyan .@ Lukas Ko...* Lukas Ko...Steliyan…..3 Stellyan ...Lukas Ko...Nikolay Y..Nikolay Y…Steliyan...Lukas Ko...LUxaS KO..Actions••+77% DoneDONE MDONEIN DEVDONE MDONE VIDONEDONEIN DEV VDONE MDONE y9:54 AM Daily - Platform• 0.(ah)Daily - Platform - now8•CE: DeperDependol• Far DepeD TasksInsights & Coachin• DelD ux.© 2Xz Improve Story~ DetailsAssignee &32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 bulld fallingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcUrEhD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 + ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1ot000-No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed8 -20384 05 11 ****=1Missed deletion of stale0 -400R3 409009(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan Georgievvalik9:298 • Tue 14 Apr 9:54:54ChatNikolay YankovNikolay Ivanov...
|
NULL
|
|
8589
|
165
|
30
|
2026-04-14T06:54:54.908860+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149694908_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarks→ Import bookma FirefoxFileEoitViewHistoryBookmarks→ Import bookmarks..Sprint BoardT SRD QueueGithuh~ Google GeminiProfilesToolsWindow Helpoearen wiin voogle or enter douresePlatform Sprint 1 Q2 - Platform Tea( Jy 19798 evaluation for ai activity t+) Nav Tab+ New Tabselecteu uataset name.about newta.async1o.run(run_evaruation(….)),which orchestrates the collection,endpoint calling, and result scoringphases.are there soem issueswith this PRshow thinkingYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es , the code loopsthrough raw_samples and awaits asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather()(ideally wrapped in anohore to avoidi• Enter a prompt for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageFirefoxPlatform Sprint1 Q2 - Platfor...Search with Google or enter addressMInbox (1,539)-Meet - Daily -PlatformJY-20543 addAJ reports...WikipediaYouTubeReddit| Daily - Platform • nowA100% C8•Tue 14 Apr 9:54:54Ö:22°CNew York CityAdd-ons forFirefox...
|
NULL
|
121605695049758242
|
NULL
|
visual_change
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarks→ Import bookma FirefoxFileEoitViewHistoryBookmarks→ Import bookmarks..Sprint BoardT SRD QueueGithuh~ Google GeminiProfilesToolsWindow Helpoearen wiin voogle or enter douresePlatform Sprint 1 Q2 - Platform Tea( Jy 19798 evaluation for ai activity t+) Nav Tab+ New Tabselecteu uataset name.about newta.async1o.run(run_evaruation(….)),which orchestrates the collection,endpoint calling, and result scoringphases.are there soem issueswith this PRshow thinkingYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es , the code loopsthrough raw_samples and awaits asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather()(ideally wrapped in anohore to avoidi• Enter a prompt for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageFirefoxPlatform Sprint1 Q2 - Platfor...Search with Google or enter addressMInbox (1,539)-Meet - Daily -PlatformJY-20543 addAJ reports...WikipediaYouTubeReddit| Daily - Platform • nowA100% C8•Tue 14 Apr 9:54:54Ö:22°CNew York CityAdd-ons forFirefox...
|
NULL
|
|
8591
|
167
|
0
|
2026-04-14T06:55:06.977785+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149706977_m2.jpg...
|
Firefox
|
Work — Mozilla Firefox
|
1
|
app.staging.jiminny.com
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
New Tab
New Tab
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Listen
Show more options
Show thinking
Gemini said
Gemini said
Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues, categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
, the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions. In Python, this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
, or run the blocking I/O operations in a separate thread using
asyncio.to_thread()
.
Architecture & State Management
Architecture & State Management
Mutating Class-Level Variables (Lines 323-328):
The script parses CLI arguments and directly mutates the class attributes (e.g.,
AiActivityTypeEvaluator.DATA_SET = args.data_set
). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.
Fix:
Pass these arguments into the
__init__
constructor of the instance (
self.data_set = args.data_set
) rather than modifying the class definition globally.
Code Quality & Edge Cases
Code Quality & Edge Cases
Broad Exception Catching (Line 107):
Using a bare
except Exception:
during the replayability probe is dangerous. It will catch
aiohttp.ClientError
(which is intended), but it will also silently swallow
TimeoutError
or even memory errors, making debugging incredibly difficult if the endpoint goes down.
Fix:
Scope the exception to specific expected errors, like
except (aiohttp.ClientError, asyncio.TimeoutError):
.
Inefficient Size Calculation (Line 95):
len(json.dumps(request_body)) < min_size
requires serializing the entire dictionary to a string
just
to check its length. Since
min_size
defaults to
0
, the
if min_size
short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.
Potential Path Traversal Risk (Lines 110, 113, 251):
The script writes and reads files directly using
f"{es_id}.json"
and
f"gt_{req_id}.json"
. If
es_id
or
req_id
originate from a database and ever contain characters like
../
, it could result in files being read from or written to unintended directories.
Fix:
Sanitize the IDs or strictly enforce that they only contain alphanumeric characters....
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"bounds":{"left":0.00234375,"top":0.06458333,"width":0.0890625,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"bounds":{"left":0.0,"top":0.10208333,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"bounds":{"left":0.015625,"top":0.11180556,"width":0.11796875,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"bounds":{"left":0.0,"top":0.13055556,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"bounds":{"left":0.015625,"top":0.14027777,"width":0.1984375,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"New Tab","depth":4,"bounds":{"left":0.0,"top":0.15902779,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"New Tab","depth":5,"bounds":{"left":0.015625,"top":0.16875,"width":0.017578125,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.07890625,"top":0.16527778,"width":0.009375,"height":0.016666668},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.003125,"top":0.18888889,"width":0.08710937,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.003125,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.01640625,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"bounds":{"left":0.029296875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.0421875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.05546875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.2171875,"top":0.06736111,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.23125,"top":0.06736111,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"WORK, Google Account: lukas.kovalik@jiminny.com","depth":12,"bounds":{"left":0.228125,"top":0.10972222,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.0984375,"top":0.10972222,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Chat","depth":12,"bounds":{"left":0.1953125,"top":0.10972222,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.2109375,"top":0.10972222,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.09335937,"top":0.14791666,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.09335937,"top":0.15,"width":0.14101562,"height":0.022222223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":22,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This Python script defines the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Core Functionality","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Core Functionality","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Data Collection (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"collect_test_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"* Supports fetching raw samples from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Elasticsearch (ES)","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or loading them from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"local JSON files","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Implements a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"replayability check","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":": It probes the live API endpoint to ensure the activity exists before including it in the dataset.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Filters data based on request body size, language, and the presence of an original response.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Ground Truth Management:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Saves inputs and \"gold standard\" responses to a local directory structure (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/...","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":").","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Features an optional mode to populate ground truth using a high-order LLM (e.g.,","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"gpt-5_2","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") if the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"POPULATE_GROUND_TRUTH_LLM","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"flag is set.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Evaluation Logic (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"evaluate_against_ground_truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Compares the model's predicted","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ai_activity_type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"against either the original ES response or the LLM-generated ground truth.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Assigns a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"quality_score","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"of","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a match and","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a mismatch or request failure.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Key Configuration Constants","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Key Configuration Constants","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy table","depth":24,"bounds":{"left":0.22070312,"top":0.0,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Execution Workflow","depth":23,"bounds":{"left":0.103125,"top":0.004166667,"width":0.13320312,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Execution Workflow","depth":24,"bounds":{"left":0.103125,"top":0.0055555557,"width":0.060546875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Parsing:","depth":26,"bounds":{"left":0.1171875,"top":0.028472222,"width":0.024609376,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script uses","depth":26,"bounds":{"left":0.14179687,"top":0.028472222,"width":0.04609375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"create_parser","depth":27,"bounds":{"left":0.19023438,"top":0.029861111,"width":0.042578124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to handle CLI arguments like","depth":26,"bounds":{"left":0.1171875,"top":0.028472222,"width":0.119140625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--data-set","depth":27,"bounds":{"left":0.1171875,"top":0.047916666,"width":0.107421875,"height":0.03125},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.12929687,"top":0.06458333,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--populate-ground-truth-llm","depth":27,"bounds":{"left":0.13476562,"top":0.065972224,"width":0.08867188,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", and","depth":26,"bounds":{"left":0.1171875,"top":0.06458333,"width":0.11171875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--compare-to-ground-truth","depth":27,"bounds":{"left":0.13203125,"top":0.084027775,"width":0.08203125,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"bounds":{"left":0.21640626,"top":0.08263889,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Initialization:","depth":26,"bounds":{"left":0.1171875,"top":0.10625,"width":0.039453126,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sets up directory paths for inputs and ground truths based on the selected dataset name.","depth":26,"bounds":{"left":0.1171875,"top":0.10625,"width":0.11875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Run:","depth":26,"bounds":{"left":0.1171875,"top":0.16597222,"width":0.013671875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Executes the evaluation via","depth":26,"bounds":{"left":0.13085938,"top":0.16597222,"width":0.080078125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.run(run_evaluation(...))","depth":27,"bounds":{"left":0.11953125,"top":0.18541667,"width":0.10507812,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", which orchestrates the collection, endpoint calling, and result scoring phases.","depth":26,"bounds":{"left":0.1171875,"top":0.18402778,"width":0.11289062,"height":0.06875},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.1203125,"top":0.3013889,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.1375,"top":0.3013889,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said are there soem issues with this PR","depth":21,"bounds":{"left":0.1609375,"top":0.30972221,"width":0.072265625,"height":0.03888889},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.09335937,"top":0.31180555,"width":0.0234375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"are there soem issues with this PR","depth":23,"bounds":{"left":0.1609375,"top":0.3125,"width":0.062109374,"height":0.034027778},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":22,"bounds":{"left":0.22382812,"top":0.37638888,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"bounds":{"left":0.22382812,"top":0.3715278,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":26,"bounds":{"left":0.11953125,"top":0.3784722,"width":0.036328126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":20,"bounds":{"left":0.11679687,"top":0.41458333,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"bounds":{"left":0.11679687,"top":0.41597223,"width":0.048046876,"height":0.016666668},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.","depth":24,"bounds":{"left":0.10390625,"top":0.42291668,"width":0.13242188,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of the most notable issues, categorized by impact:","depth":24,"bounds":{"left":0.10390625,"top":0.48819444,"width":0.11640625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Performance & Async Anti-Patterns","depth":23,"bounds":{"left":0.10390625,"top":0.5416667,"width":0.13242188,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Performance & Async Anti-Patterns","depth":24,"bounds":{"left":0.10390625,"top":0.54305553,"width":0.10859375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sequential Requests Defeating Async (Lines 86-106):","depth":26,"bounds":{"left":0.11757813,"top":0.5659722,"width":0.11601563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inside","depth":26,"bounds":{"left":0.16289063,"top":0.58402777,"width":0.019921875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"_collect_from_es","depth":27,"bounds":{"left":0.11992188,"top":0.60347223,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", the code loops through","depth":26,"bounds":{"left":0.11757813,"top":0.6020833,"width":0.10429688,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"raw_samples","depth":27,"bounds":{"left":0.14414063,"top":0.6215278,"width":0.0359375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"bounds":{"left":0.18242188,"top":0.6201389,"width":0.0140625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"await","depth":27,"bounds":{"left":0.19882813,"top":0.6215278,"width":0.01640625,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"s a","depth":26,"bounds":{"left":0.21757813,"top":0.6201389,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"session.post","depth":27,"bounds":{"left":0.11992188,"top":0.63958335,"width":0.039453126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using","depth":26,"bounds":{"left":0.11757813,"top":0.63819444,"width":0.11640625,"height":0.08680555},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp","depth":27,"bounds":{"left":0.18046875,"top":0.7118056,"width":0.023046875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for asynchronous concurrency.","depth":26,"bounds":{"left":0.11757813,"top":0.7104167,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.75208336,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Collect these coroutines into a list and use","depth":28,"bounds":{"left":0.1296875,"top":0.75208336,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.gather()","depth":29,"bounds":{"left":0.16601562,"top":0.77152777,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(ideally wrapped in an","depth":28,"bounds":{"left":0.1296875,"top":0.77013886,"width":0.09257813,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.Semaphore","depth":29,"bounds":{"left":0.13203125,"top":0.8076389,"width":0.055859376,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to avoid overwhelming the endpoint) to run the replayability probes concurrently.","depth":28,"bounds":{"left":0.1296875,"top":0.80625,"width":0.10546875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):","depth":26,"bounds":{"left":0.11757813,"top":0.8659722,"width":0.10976563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Standard synchronous file operations (","depth":26,"bounds":{"left":0.11757813,"top":0.8840278,"width":0.107421875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"with open(...)","depth":27,"bounds":{"left":0.12226562,"top":0.9215278,"width":0.045703124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.1703125,"top":0.9201389,"width":0.003515625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.dump","depth":27,"bounds":{"left":0.17617187,"top":0.9215278,"width":0.029296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.2078125,"top":0.9201389,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.load","depth":27,"bounds":{"left":0.11992188,"top":0.93958336,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") are used inside","depth":26,"bounds":{"left":0.15195313,"top":0.93819445,"width":0.048046876,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"async def","depth":27,"bounds":{"left":0.20234375,"top":0.93958336,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"functions. In Python, this blocks the entire event loop while the disk is reading/writing.","depth":26,"bounds":{"left":0.11757813,"top":0.93819445,"width":0.11875,"height":0.061805546},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":1.0,"width":0.009765625,"height":-0.015972257},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Use an asynchronous file I/O library like","depth":28,"bounds":{"left":0.1296875,"top":1.0,"width":0.09335937,"height":-0.015972257},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiofiles","depth":29,"bounds":{"left":0.1625,"top":1.0,"width":0.0265625,"height":-0.035416722},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", or run the blocking I/O operations in a separate thread using","depth":28,"bounds":{"left":0.1296875,"top":1.0,"width":0.10585938,"height":-0.034027815},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.to_thread()","depth":29,"bounds":{"left":0.16914062,"top":1.0,"width":0.0625,"height":-0.07152772},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"bounds":{"left":0.23398438,"top":1.0,"width":0.0015625,"height":-0.07013893},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Architecture & State Management","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Architecture & State Management","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Mutating Class-Level Variables (Lines 323-328):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script parses CLI arguments and directly mutates the class attributes (e.g.,","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator.DATA_SET = args.data_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Pass these arguments into the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"__init__","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"constructor of the instance (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"self.data_set = args.data_set","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") rather than modifying the class definition globally.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Code Quality & Edge Cases","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Code Quality & Edge Cases","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Broad Exception Catching (Line 107):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Using a bare","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"except Exception:","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"during the replayability probe is dangerous. It will catch","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp.ClientError","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(which is intended), but it will also silently swallow","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TimeoutError","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or even memory errors, making debugging incredibly difficult if the endpoint goes down.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Scope the exception to specific expected errors, like","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"except (aiohttp.ClientError, asyncio.TimeoutError):","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inefficient Size Calculation (Line 95):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"len(json.dumps(request_body)) < min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"requires serializing the entire dictionary to a string","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"just","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to check its length. Since","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"defaults to","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", the","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"if min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Potential Path Traversal Risk (Lines 110, 113, 251):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script writes and reads files directly using","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"f\"{es_id}.json\"","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"f\"gt_{req_id}.json\"","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". If","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es_id","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"req_id","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"originate from a database and ever contain characters like","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"../","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", it could result in files being read from or written to unintended directories.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sanitize the IDs or strictly enforce that they only contain alphanumeric characters.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4930158969472011583
|
8732321575041020375
|
visual_change
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
New Tab
New Tab
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Listen
Show more options
Show thinking
Gemini said
Gemini said
Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues, categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
, the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions. In Python, this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
, or run the blocking I/O operations in a separate thread using
asyncio.to_thread()
.
Architecture & State Management
Architecture & State Management
Mutating Class-Level Variables (Lines 323-328):
The script parses CLI arguments and directly mutates the class attributes (e.g.,
AiActivityTypeEvaluator.DATA_SET = args.data_set
). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.
Fix:
Pass these arguments into the
__init__
constructor of the instance (
self.data_set = args.data_set
) rather than modifying the class definition globally.
Code Quality & Edge Cases
Code Quality & Edge Cases
Broad Exception Catching (Line 107):
Using a bare
except Exception:
during the replayability probe is dangerous. It will catch
aiohttp.ClientError
(which is intended), but it will also silently swallow
TimeoutError
or even memory errors, making debugging incredibly difficult if the endpoint goes down.
Fix:
Scope the exception to specific expected errors, like
except (aiohttp.ClientError, asyncio.TimeoutError):
.
Inefficient Size Calculation (Line 95):
len(json.dumps(request_body)) < min_size
requires serializing the entire dictionary to a string
just
to check its length. Since
min_size
defaults to
0
, the
if min_size
short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.
Potential Path Traversal Risk (Lines 110, 113, 251):
The script writes and reads files directly using
f"{es_id}.json"
and
f"gt_{req_id}.json"
. If
es_id
or
req_id
originate from a database and ever contain characters like
../
, it could result in files being read from or written to unintended directories.
Fix:
Sanitize the IDs or strictly enforce that they only contain alphanumeric characters....
|
NULL
|
|
8594
|
167
|
2
|
2026-04-14T06:55:10.912813+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149710912_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.siaemne.minny.com~ Google Gemini.. XPlatform Sprint 1 Q2 - Platform TeaJy 19798 evaluation for ai activity t• Jiminny+ New labsocc ce Cdcseciie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoesesare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await s asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio.Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page•< 40 ll © | Daily - Platform • 10m leftA 100%C &• Tue 14 Apr 9:55:10Sign InSign in with ffoogleSign in with Office 365Login via SSOEUWorldwideDon't have account yet? Speak to your sales leader to get an invite.By using this service you agree to the Privacy Policy and Terms of Service.TERMS/KNOWLEDGE BASE BLOG...
|
NULL
|
-1239825134536860121
|
NULL
|
click
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.siaemne.minny.com~ Google Gemini.. XPlatform Sprint 1 Q2 - Platform TeaJy 19798 evaluation for ai activity t• Jiminny+ New labsocc ce Cdcseciie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoesesare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await s asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio.Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize page•< 40 ll © | Daily - Platform • 10m leftA 100%C &• Tue 14 Apr 9:55:10Sign InSign in with ffoogleSign in with Office 365Login via SSOEUWorldwideDon't have account yet? Speak to your sales leader to get an invite.By using this service you agree to the Privacy Policy and Terms of Service.TERMS/KNOWLEDGE BASE BLOG...
|
NULL
|
|
8596
|
167
|
4
|
2026-04-14T06:55:16.036688+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149716036_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesWindowH FirefoxFileEoitViewHistoryBookmarksProfilesWindowHelpPlatform Sprint 1 Q2 - Platform TeaJy 19798 evaluation for ai activity t*Jiminny+ New labToolsdoo.sideme.minny.com/casnooara~ Google Gemini•. Xsoccce cdcselnde.3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringOeses< 40 ll1 Daily - Platform • 10m left100% CD8 • Tue 14 Apr 9:55:15My RecordingsEveryone's RecordingsTrending this monthLive FeedSNo RecordingsScheduleThis WeekEveryone's Scheduleare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await s asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio.Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageUnknown CustomerRefinement - Processing|Wednesday, 12:30 AM|Uпknown customerProcessing tickets reviewWednesday, 4:30 AM|Robinson frusoe Cruises Limited ®Sprint RevievClosed LostWednesday, 6:00 AM|Veselin Kulov listened to call 6d/ADr. 2.51 PMIactivity with unknown customerf Held: 7 Apr, 2:36 PM|•) Duration: 7ml£ Value: $0Nikolay Nikolov listened to call 6d27 Feb, 7:38 AMWeb Demo with Martin PetkovHeld: 13 Feb, 5:15 AM© Duration: 11ml§ Value: $11Nikolay Yankov listened to call 6d12 Feb, 1:07 AMacuivicy wicniNikolay тапкоvHe Held: T7 Feh. "7:47 AMI@ Duration: 1ml≥ Value: $0Nikolay Yankov listened to call 6d12 Feb, 1:06 AMaccivity with Nikolav YankovIEB Held: 17 Feb. 17:42 AMI© Duration: 1m l≥ Value: $01Martin Petkov listened to call 6d22 Jan, 4:30 AM|Discovery with Martin Petkov|F Held: 21 Jan, 7:00 AM|Ô Duration: 35m≥ Value: $21Martin Petkov listened to call 6dZU Jan. 24ZAMIactivity with Jonathan HeatonE Held: 25 Sep, 2024, 3:33 AM© Duration: 37ml& Value: £8,886,Galya Dimitrova listened to call 6dZ6 Nov. 2025,4:29 AMIactivity with David EhrlichE Held: 9 Oct, 2024, 5:31 AM|• Duration: 39m≥ Value: $23,000)Aneliya Angelova listened to call 6д18 Auo. 7075. 10'45 AMICold call with Robinson Crusoe New 13333f Held: 23 Jul, 2025, 7:43 AM|© Duration: 2m& Value: $5,Jiminny Web SA shared in Slack 2Cold call with Robinson Crusoe New 13333|Held: 23 Jul, 2025, 7:43 AM@ Duration: 2mllia.app.staging.jiminny.com…....
|
NULL
|
7599185317630420426
|
NULL
|
visual_change
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesWindowH FirefoxFileEoitViewHistoryBookmarksProfilesWindowHelpPlatform Sprint 1 Q2 - Platform TeaJy 19798 evaluation for ai activity t*Jiminny+ New labToolsdoo.sideme.minny.com/casnooara~ Google Gemini•. Xsoccce cdcselnde.3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringOeses< 40 ll1 Daily - Platform • 10m left100% CD8 • Tue 14 Apr 9:55:15My RecordingsEveryone's RecordingsTrending this monthLive FeedSNo RecordingsScheduleThis WeekEveryone's Scheduleare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await s asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio.Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageUnknown CustomerRefinement - Processing|Wednesday, 12:30 AM|Uпknown customerProcessing tickets reviewWednesday, 4:30 AM|Robinson frusoe Cruises Limited ®Sprint RevievClosed LostWednesday, 6:00 AM|Veselin Kulov listened to call 6d/ADr. 2.51 PMIactivity with unknown customerf Held: 7 Apr, 2:36 PM|•) Duration: 7ml£ Value: $0Nikolay Nikolov listened to call 6d27 Feb, 7:38 AMWeb Demo with Martin PetkovHeld: 13 Feb, 5:15 AM© Duration: 11ml§ Value: $11Nikolay Yankov listened to call 6d12 Feb, 1:07 AMacuivicy wicniNikolay тапкоvHe Held: T7 Feh. "7:47 AMI@ Duration: 1ml≥ Value: $0Nikolay Yankov listened to call 6d12 Feb, 1:06 AMaccivity with Nikolav YankovIEB Held: 17 Feb. 17:42 AMI© Duration: 1m l≥ Value: $01Martin Petkov listened to call 6d22 Jan, 4:30 AM|Discovery with Martin Petkov|F Held: 21 Jan, 7:00 AM|Ô Duration: 35m≥ Value: $21Martin Petkov listened to call 6dZU Jan. 24ZAMIactivity with Jonathan HeatonE Held: 25 Sep, 2024, 3:33 AM© Duration: 37ml& Value: £8,886,Galya Dimitrova listened to call 6dZ6 Nov. 2025,4:29 AMIactivity with David EhrlichE Held: 9 Oct, 2024, 5:31 AM|• Duration: 39m≥ Value: $23,000)Aneliya Angelova listened to call 6д18 Auo. 7075. 10'45 AMICold call with Robinson Crusoe New 13333f Held: 23 Jul, 2025, 7:43 AM|© Duration: 2m& Value: $5,Jiminny Web SA shared in Slack 2Cold call with Robinson Crusoe New 13333|Held: 23 Jul, 2025, 7:43 AM@ Duration: 2mllia.app.staging.jiminny.com…....
|
NULL
|
|
8597
|
166
|
1
|
2026-04-14T06:55:17.879190+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149717879_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)E:1Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI ClActions shttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 Projectslã AwsE3 SSHEl DatadogPlatform TeamQ Search board209$ Jy-19240 / [ JY-18900READY FOR DEV 1Subtasksoohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-205641 П ****:PriorityStor...JY-20574 [Ai) Ensure PDF formatting is good= Medium9 JY-20582 (BE+AI+Infra] Create new queue= Medium% JY-20586 [QA] Manual Testing= Medium% JY-20626 Integration testing= Mediumis cioned byA JY-20458 CRUD reports with Ask JiminnyJY-20849 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an….JY-20660 Non-creator cannot delete generated Ask Jiminny report from A..JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...• JY-20654 Generated Ask Jiminny Report is visible oniy to creator and shar...• Jy-20650 Generated Ask Jiminheport for melodes orene ng. loport se* Claude3 CircieciA, SentrylOCI-Hide doneBulk editView in searchSteneneAnellya A.KEAUTDURUEY8 Unassign...READY FOR DEVSDErLOTEUYDONEMDONEDONEWUntDONE MDONEM9:55 AM | Daily - Platform(nh|Daily - Platform - 10 m left8•CE: DepeDependol• Far DepeD Tasks2 Insights & CoachinD DevD ux.© 2In Dev vXz Improve Story~ DetailsAssignee32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcUrEhD Al BookmarksGroup: QueriesDEPLOY 8leature fiag reieaseDeployed0 -20585 15 + ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 |1 0 •**•=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 11 ****=Missed deletion of stale408099(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan GeorgievKovalik9:538 • Tue 14 Apr 9:55:17ChatNikolay YankovNikolay lvanov...
|
NULL
|
-8158254741196970098
|
NULL
|
click
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)E:1Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI ClActions shttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 Projectslã AwsE3 SSHEl DatadogPlatform TeamQ Search board209$ Jy-19240 / [ JY-18900READY FOR DEV 1Subtasksoohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-205641 П ****:PriorityStor...JY-20574 [Ai) Ensure PDF formatting is good= Medium9 JY-20582 (BE+AI+Infra] Create new queue= Medium% JY-20586 [QA] Manual Testing= Medium% JY-20626 Integration testing= Mediumis cioned byA JY-20458 CRUD reports with Ask JiminnyJY-20849 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an….JY-20660 Non-creator cannot delete generated Ask Jiminny report from A..JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...• JY-20654 Generated Ask Jiminny Report is visible oniy to creator and shar...• Jy-20650 Generated Ask Jiminheport for melodes orene ng. loport se* Claude3 CircieciA, SentrylOCI-Hide doneBulk editView in searchSteneneAnellya A.KEAUTDURUEY8 Unassign...READY FOR DEVSDErLOTEUYDONEMDONEDONEWUntDONE MDONEM9:55 AM | Daily - Platform(nh|Daily - Platform - 10 m left8•CE: DepeDependol• Far DepeD Tasks2 Insights & CoachinD DevD ux.© 2In Dev vXz Improve Story~ DetailsAssignee32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcUrEhD Al BookmarksGroup: QueriesDEPLOY 8leature fiag reieaseDeployed0 -20585 15 + ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 |1 0 •**•=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 11 ****=Missed deletion of stale408099(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan GeorgievKovalik9:538 • Tue 14 Apr 9:55:17ChatNikolay YankovNikolay lvanov...
|
NULL
|
|
8599
|
167
|
6
|
2026-04-14T06:55:19.229887+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149719229_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. X•< 40 ll © | Daily - Platform • 10m leftA 100%C &• Tue 14 Apr 9:55:18Platform Sprint 1 Q2 - Platform TeaJy 19798 evaluation for ai activity t•Jiminny+ New labAl Reports• Report nameNAME €S03 Ask Jiminny reportsPeriodiReport Type© Clear all |socc ce Cdcseciie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringolcses.FREQUENCY #DATE €ACTIONSI10are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await s asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageapp.staging.fiminnv.com/ai-rgports k-iiminny - 869453...
|
NULL
|
1743723797016129039
|
NULL
|
visual_change
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. X•< 40 ll © | Daily - Platform • 10m leftA 100%C &• Tue 14 Apr 9:55:18Platform Sprint 1 Q2 - Platform TeaJy 19798 evaluation for ai activity t•Jiminny+ New labAl Reports• Report nameNAME €S03 Ask Jiminny reportsPeriodiReport Type© Clear all |socc ce Cdcseciie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringolcses.FREQUENCY #DATE €ACTIONSI10are there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await s asession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageapp.staging.fiminnv.com/ai-rgports k-iiminny - 869453...
|
NULL
|
|
8601
|
167
|
8
|
2026-04-14T06:55:28.298806+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149728298_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. X•< 40 Wl © | Daily -Platform • 10mleft A 100%C &• Tue 14 Apr 9:55:27Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 2026S03 Ask Jiminny reportsPeriodiReport Type© Clear all |10FREQUENCY+DailySHAREDICCanDATEY09/04/9016ACTIONSP Croare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne about peodleYour privacy & GeminiSummarize pageGapp.staging.jiminny.com/a...
|
NULL
|
3214708001850927741
|
NULL
|
visual_change
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. X•< 40 Wl © | Daily -Platform • 10mleft A 100%C &• Tue 14 Apr 9:55:27Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 2026S03 Ask Jiminny reportsPeriodiReport Type© Clear all |10FREQUENCY+DailySHAREDICCanDATEY09/04/9016ACTIONSP Croare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne about peodleYour privacy & GeminiSummarize pageGapp.staging.jiminny.com/a...
|
NULL
|
|
8604
|
167
|
10
|
2026-04-14T06:55:37.342012+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149737342_m2.jpg...
|
Firefox
|
Jiminny — Work
|
1
|
app.staging.jiminny.com/ai-reports
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jiminny
Jiminny
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Listen
Show more options
Show thinking
Gemini said
Gemini said
Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues, categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
, the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions. In Python, this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
, or run the blocking I/O operations in a separate thread using
asyncio.to_thread()
.
Architecture & State Management
Architecture & State Management
Mutating Class-Level Variables (Lines 323-328):
The script parses CLI arguments and directly mutates the class attributes (e.g.,
AiActivityTypeEvaluator.DATA_SET = args.data_set
). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.
Fix:
Pass these arguments into the
__init__
constructor of the instance (
self.data_set = args.data_set
) rather than modifying the class definition globally.
Code Quality & Edge Cases
Code Quality & Edge Cases
Broad Exception Catching (Line 107):
Using a bare
except Exception:
during the replayability probe is dangerous. It will catch
aiohttp.ClientError
(which is intended), but it will also silently swallow
TimeoutError
or even memory errors, making debugging incredibly difficult if the endpoint goes down.
Fix:
Scope the exception to specific expected errors, like
except (aiohttp.ClientError, asyncio.TimeoutError):
.
Inefficient Size Calculation (Line 95):
len(json.dumps(request_body)) < min_size
requires serializing the entire dictionary to a string
just
to check its length. Since
min_size
defaults to
0
, the
if min_size
short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.
Potential Path Traversal Risk (Lines 110, 113, 251):
The script writes and reads files directly using
f"{es_id}.json"
and
f"gt_{req_id}.json"
. If...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"bounds":{"left":0.00234375,"top":0.045138888,"width":0.0890625,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"bounds":{"left":0.0,"top":0.08263889,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"bounds":{"left":0.015625,"top":0.09236111,"width":0.11796875,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"bounds":{"left":0.0,"top":0.11111111,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"bounds":{"left":0.015625,"top":0.12083333,"width":0.1984375,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jiminny","depth":4,"bounds":{"left":0.0,"top":0.13958333,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Jiminny","depth":5,"bounds":{"left":0.015625,"top":0.14930555,"width":0.015625,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.07890625,"top":0.14583333,"width":0.009375,"height":0.016666668},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.003125,"top":0.16944444,"width":0.08710937,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.003125,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.01640625,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"bounds":{"left":0.029296875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.0421875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.05546875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.2171875,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.23125,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"WORK, Google Account: lukas.kovalik@jiminny.com","depth":12,"bounds":{"left":0.228125,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.0984375,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Chat","depth":12,"bounds":{"left":0.1953125,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.2109375,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.09335937,"top":0.12847222,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.09335937,"top":0.13055556,"width":0.14101562,"height":0.022222223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":22,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This Python script defines the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Core Functionality","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Core Functionality","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Data Collection (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"collect_test_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"* Supports fetching raw samples from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Elasticsearch (ES)","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or loading them from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"local JSON files","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Implements a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"replayability check","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":": It probes the live API endpoint to ensure the activity exists before including it in the dataset.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Filters data based on request body size, language, and the presence of an original response.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Ground Truth Management:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Saves inputs and \"gold standard\" responses to a local directory structure (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/...","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":").","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Features an optional mode to populate ground truth using a high-order LLM (e.g.,","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"gpt-5_2","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") if the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"POPULATE_GROUND_TRUTH_LLM","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"flag is set.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Evaluation Logic (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"evaluate_against_ground_truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Compares the model's predicted","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ai_activity_type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"against either the original ES response or the LLM-generated ground truth.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Assigns a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"quality_score","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"of","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a match and","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a mismatch or request failure.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Key Configuration Constants","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Key Configuration Constants","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy table","depth":24,"bounds":{"left":0.22070312,"top":0.0,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Execution Workflow","depth":23,"bounds":{"left":0.103125,"top":0.0,"width":0.13320312,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Execution Workflow","depth":24,"bounds":{"left":0.103125,"top":0.0,"width":0.060546875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Parsing:","depth":26,"bounds":{"left":0.1171875,"top":0.009027778,"width":0.024609376,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script uses","depth":26,"bounds":{"left":0.14179687,"top":0.009027778,"width":0.04609375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"create_parser","depth":27,"bounds":{"left":0.19023438,"top":0.010416667,"width":0.042578124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to handle CLI arguments like","depth":26,"bounds":{"left":0.1171875,"top":0.009027778,"width":0.119140625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--data-set","depth":27,"bounds":{"left":0.1171875,"top":0.028472222,"width":0.107421875,"height":0.03125},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.12929687,"top":0.045138888,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--populate-ground-truth-llm","depth":27,"bounds":{"left":0.13476562,"top":0.046527777,"width":0.08867188,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", and","depth":26,"bounds":{"left":0.1171875,"top":0.045138888,"width":0.11171875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--compare-to-ground-truth","depth":27,"bounds":{"left":0.13203125,"top":0.06458333,"width":0.08203125,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"bounds":{"left":0.21640626,"top":0.063194446,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Initialization:","depth":26,"bounds":{"left":0.1171875,"top":0.08680555,"width":0.039453126,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sets up directory paths for inputs and ground truths based on the selected dataset name.","depth":26,"bounds":{"left":0.1171875,"top":0.08680555,"width":0.11875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Run:","depth":26,"bounds":{"left":0.1171875,"top":0.14652778,"width":0.013671875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Executes the evaluation via","depth":26,"bounds":{"left":0.13085938,"top":0.14652778,"width":0.080078125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.run(run_evaluation(...))","depth":27,"bounds":{"left":0.11953125,"top":0.16597222,"width":0.10507812,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", which orchestrates the collection, endpoint calling, and result scoring phases.","depth":26,"bounds":{"left":0.1171875,"top":0.16458334,"width":0.11289062,"height":0.06875},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.1203125,"top":0.28194445,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.1375,"top":0.28194445,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said are there soem issues with this PR","depth":21,"bounds":{"left":0.1609375,"top":0.29027778,"width":0.072265625,"height":0.03888889},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.09335937,"top":0.2923611,"width":0.0234375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"are there soem issues with this PR","depth":23,"bounds":{"left":0.1609375,"top":0.29305556,"width":0.062109374,"height":0.034027778},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":22,"bounds":{"left":0.22382812,"top":0.35694444,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"bounds":{"left":0.22382812,"top":0.35208333,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":26,"bounds":{"left":0.11953125,"top":0.35902777,"width":0.036328126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":20,"bounds":{"left":0.11679687,"top":0.3951389,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"bounds":{"left":0.11679687,"top":0.39652777,"width":0.048046876,"height":0.016666668},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.","depth":24,"bounds":{"left":0.10390625,"top":0.40347221,"width":0.13242188,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of the most notable issues, categorized by impact:","depth":24,"bounds":{"left":0.10390625,"top":0.46875,"width":0.11640625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Performance & Async Anti-Patterns","depth":23,"bounds":{"left":0.10390625,"top":0.5222222,"width":0.13242188,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Performance & Async Anti-Patterns","depth":24,"bounds":{"left":0.10390625,"top":0.5236111,"width":0.10859375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sequential Requests Defeating Async (Lines 86-106):","depth":26,"bounds":{"left":0.11757813,"top":0.5465278,"width":0.11601563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inside","depth":26,"bounds":{"left":0.16289063,"top":0.56458336,"width":0.019921875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"_collect_from_es","depth":27,"bounds":{"left":0.11992188,"top":0.58402777,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", the code loops through","depth":26,"bounds":{"left":0.11757813,"top":0.58263886,"width":0.10429688,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"raw_samples","depth":27,"bounds":{"left":0.14414063,"top":0.6020833,"width":0.0359375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"bounds":{"left":0.18242188,"top":0.6006944,"width":0.0140625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"await","depth":27,"bounds":{"left":0.19882813,"top":0.6020833,"width":0.01640625,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"s a","depth":26,"bounds":{"left":0.21757813,"top":0.6006944,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"session.post","depth":27,"bounds":{"left":0.11992188,"top":0.6201389,"width":0.039453126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using","depth":26,"bounds":{"left":0.11757813,"top":0.61875,"width":0.11640625,"height":0.08680555},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp","depth":27,"bounds":{"left":0.18046875,"top":0.6923611,"width":0.023046875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for asynchronous concurrency.","depth":26,"bounds":{"left":0.11757813,"top":0.6909722,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.7326389,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Collect these coroutines into a list and use","depth":28,"bounds":{"left":0.1296875,"top":0.7326389,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.gather()","depth":29,"bounds":{"left":0.16601562,"top":0.75208336,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(ideally wrapped in an","depth":28,"bounds":{"left":0.1296875,"top":0.75069445,"width":0.09257813,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.Semaphore","depth":29,"bounds":{"left":0.13203125,"top":0.7881944,"width":0.055859376,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to avoid overwhelming the endpoint) to run the replayability probes concurrently.","depth":28,"bounds":{"left":0.1296875,"top":0.78680557,"width":0.10546875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):","depth":26,"bounds":{"left":0.11757813,"top":0.84652776,"width":0.10976563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Standard synchronous file operations (","depth":26,"bounds":{"left":0.11757813,"top":0.8645833,"width":0.107421875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"with open(...)","depth":27,"bounds":{"left":0.12226562,"top":0.90208334,"width":0.045703124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.1703125,"top":0.90069443,"width":0.003515625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.dump","depth":27,"bounds":{"left":0.17617187,"top":0.90208334,"width":0.029296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.2078125,"top":0.90069443,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.load","depth":27,"bounds":{"left":0.11992188,"top":0.9201389,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") are used inside","depth":26,"bounds":{"left":0.15195313,"top":0.91875,"width":0.048046876,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"async def","depth":27,"bounds":{"left":0.20234375,"top":0.9201389,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"functions. In Python, this blocks the entire event loop while the disk is reading/writing.","depth":26,"bounds":{"left":0.11757813,"top":0.91875,"width":0.11875,"height":0.06875},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.9965278,"width":0.009765625,"height":0.003472209},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Use an asynchronous file I/O library like","depth":28,"bounds":{"left":0.1296875,"top":0.9965278,"width":0.09335937,"height":0.003472209},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiofiles","depth":29,"bounds":{"left":0.1625,"top":1.0,"width":0.0265625,"height":-0.015972257},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", or run the blocking I/O operations in a separate thread using","depth":28,"bounds":{"left":0.1296875,"top":1.0,"width":0.10585938,"height":-0.014583349},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.to_thread()","depth":29,"bounds":{"left":0.16914062,"top":1.0,"width":0.0625,"height":-0.052083373},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"bounds":{"left":0.23398438,"top":1.0,"width":0.0015625,"height":-0.050694466},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Architecture & State Management","depth":23,"bounds":{"left":0.10390625,"top":1.0,"width":0.13242188,"height":-0.08611107},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Architecture & State Management","depth":24,"bounds":{"left":0.10390625,"top":1.0,"width":0.103515625,"height":-0.087499976},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Mutating Class-Level Variables (Lines 323-328):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script parses CLI arguments and directly mutates the class attributes (e.g.,","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator.DATA_SET = args.data_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Pass these arguments into the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"__init__","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"constructor of the instance (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"self.data_set = args.data_set","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") rather than modifying the class definition globally.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Code Quality & Edge Cases","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Code Quality & Edge Cases","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Broad Exception Catching (Line 107):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Using a bare","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"except Exception:","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"during the replayability probe is dangerous. It will catch","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp.ClientError","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(which is intended), but it will also silently swallow","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TimeoutError","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or even memory errors, making debugging incredibly difficult if the endpoint goes down.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Scope the exception to specific expected errors, like","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"except (aiohttp.ClientError, asyncio.TimeoutError):","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inefficient Size Calculation (Line 95):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"len(json.dumps(request_body)) < min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"requires serializing the entire dictionary to a string","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"just","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to check its length. Since","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"defaults to","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", the","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"if min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Potential Path Traversal Risk (Lines 110, 113, 251):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script writes and reads files directly using","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"f\"{es_id}.json\"","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"f\"gt_{req_id}.json\"","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". If","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8505520146988350698
|
6426478565827391895
|
visual_change
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jiminny
Jiminny
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Listen
Show more options
Show thinking
Gemini said
Gemini said
Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues, categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
, the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions. In Python, this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
, or run the blocking I/O operations in a separate thread using
asyncio.to_thread()
.
Architecture & State Management
Architecture & State Management
Mutating Class-Level Variables (Lines 323-328):
The script parses CLI arguments and directly mutates the class attributes (e.g.,
AiActivityTypeEvaluator.DATA_SET = args.data_set
). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.
Fix:
Pass these arguments into the
__init__
constructor of the instance (
self.data_set = args.data_set
) rather than modifying the class definition globally.
Code Quality & Edge Cases
Code Quality & Edge Cases
Broad Exception Catching (Line 107):
Using a bare
except Exception:
during the replayability probe is dangerous. It will catch
aiohttp.ClientError
(which is intended), but it will also silently swallow
TimeoutError
or even memory errors, making debugging incredibly difficult if the endpoint goes down.
Fix:
Scope the exception to specific expected errors, like
except (aiohttp.ClientError, asyncio.TimeoutError):
.
Inefficient Size Calculation (Line 95):
len(json.dumps(request_body)) < min_size
requires serializing the entire dictionary to a string
just
to check its length. Since
min_size
defaults to
0
, the
if min_size
short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.
Potential Path Traversal Risk (Lines 110, 113, 251):
The script writes and reads files directly using
f"{es_id}.json"
and
f"gt_{req_id}.json"
. If...
|
NULL
|
|
8605
|
166
|
3
|
2026-04-14T06:55:39.855574+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149739855_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. Amrhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=Priority% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byA JY-20458 CRUD reports with Ask Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..CostI ClActionsAssignee3 Steliyan...i2 Stellyan .-.Anellya A...& Unassign...DEPLOYEDDONEDONE MDONEDONE MDONEDONEWUnEStatusINDEYVINDEV vREADY FOR DEVREADY FOR DEVS9:55 AM | Daily - PlatformDaily - Platform - 10 m left8•CE: DepeDependol• Far DepeD Tasks2 Insights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY 8leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 ^ ****=|Missed deletion of staleaesese(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan GeorgievValik10:158 • Tue 14 Apr 9:55:39ChatNikolay YankovNikolay Ivanov...
|
NULL
|
6844727409046715082
|
NULL
|
visual_change
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. Amrhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=Priority% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byA JY-20458 CRUD reports with Ask Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..CostI ClActionsAssignee3 Steliyan...i2 Stellyan .-.Anellya A...& Unassign...DEPLOYEDDONEDONE MDONEDONE MDONEDONEWUnEStatusINDEYVINDEV vREADY FOR DEVREADY FOR DEVS9:55 AM | Daily - PlatformDaily - Platform - 10 m left8•CE: DepeDependol• Far DepeD Tasks2 Insights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY 8leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 ^ ****=|Missed deletion of staleaesese(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan GeorgievValik10:158 • Tue 14 Apr 9:55:39ChatNikolay YankovNikolay Ivanov...
|
NULL
|
|
8608
|
167
|
12
|
2026-04-14T06:55:41.395778+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149741395_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. XC< 40Wl © | Daily -Platform • 10mleft A 100%C &• Tue 14 Apr 9:55:40Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 20263 Ask Jiminny reportsPeriodiReport Type© Clear all |10FREQUENCY +DailySHAREDIDATE09/04/9016ACTIONSare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageJY-18909-automated-reports-ask-iiminny = 869453)...
|
NULL
|
-1409007142649833592
|
NULL
|
click
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. XC< 40Wl © | Daily -Platform • 10mleft A 100%C &• Tue 14 Apr 9:55:40Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 20263 Ask Jiminny reportsPeriodiReport Type© Clear all |10FREQUENCY +DailySHAREDIDATE09/04/9016ACTIONSare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageJY-18909-automated-reports-ask-iiminny = 869453)...
|
NULL
|
|
8610
|
167
|
14
|
2026-04-14T06:55:49.519436+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149749519_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. XC< 40Wl © | Daily -Platform • 10mleft A 100%C &• Tue 14 Apr 9:55:48Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 20263 Ask Jiminny reportsPeriodiReport Type© Clear all |10FREQUENCY+DailySHAREDIDATE09/04/9016ACTIONSare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageJY-18909-automated-reports-ask-iiminny = 869453)...
|
NULL
|
1706895816249568843
|
NULL
|
visual_change
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. XC< 40Wl © | Daily -Platform • 10mleft A 100%C &• Tue 14 Apr 9:55:48Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3.Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 20263 Ask Jiminny reportsPeriodiReport Type© Clear all |10FREQUENCY+DailySHAREDIDATE09/04/9016ACTIONSare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageJY-18909-automated-reports-ask-iiminny = 869453)...
|
NULL
|
|
8611
|
166
|
5
|
2026-04-14T06:55:55.704139+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149755704_m1.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:55
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.045833334,"top":0.11,"width":0.125,"height":0.022222223},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.045833334,"top":0.11111111,"width":0.125,"height":0.02},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.8826389,"top":0.09888889,"width":0.04097222,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.91041666,"top":0.11,"width":0.0048611113,"height":0.016666668},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.9291667,"top":0.09888889,"width":0.025,"height":0.04},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.93194443,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.9625,"top":0.11,"width":0.022222223,"height":0.016666668},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.9604167,"top":0.1,"width":0.023611112,"height":0.037777778},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"bounds":{"left":0.6201389,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"bounds":{"left":0.65347224,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"bounds":{"left":0.68680555,"top":0.7922222,"width":0.027777778,"height":0.044444446},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"bounds":{"left":0.7430556,"top":0.36666667,"width":0.08888889,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"bounds":{"left":0.87222224,"top":0.36666667,"width":0.077083334,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"bounds":{"left":0.7430556,"top":0.6188889,"width":0.09097222,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"bounds":{"left":0.87222224,"top":0.6188889,"width":0.075,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"bounds":{"left":0.7423611,"top":0.8711111,"width":0.06944445,"height":0.022222223},"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"bounds":{"left":0.9618056,"top":0.86777776,"width":0.018055556,"height":0.028888889},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:55","depth":10,"bounds":{"left":0.020833334,"top":0.93777776,"width":0.022916667,"height":0.022222223},"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"bounds":{"left":0.046527777,"top":0.93777776,"width":0.018055556,"height":0.022222223},"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"bounds":{"left":0.08194444,"top":0.90444446,"width":0.08125,"height":0.08888889},"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"bounds":{"left":0.08194444,"top":0.93777776,"width":0.08125,"height":0.022222223},"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"bounds":{"left":0.30416667,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"bounds":{"left":0.33194444,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"bounds":{"left":0.37083334,"top":0.9222222,"width":0.06111111,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"bounds":{"left":0.3986111,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"bounds":{"left":0.4375,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"bounds":{"left":0.48194444,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"bounds":{"left":0.5263889,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"bounds":{"left":0.5708333,"top":0.9222222,"width":0.03888889,"height":0.053333335},"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"bounds":{"left":0.61527777,"top":0.9222222,"width":0.025,"height":0.053333335},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"bounds":{"left":0.6458333,"top":0.9222222,"width":0.05,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"bounds":{"left":0.8875,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"bounds":{"left":0.92083335,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"bounds":{"left":0.95416665,"top":0.9222222,"width":0.033333335,"height":0.053333335},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.004166667,"top":0.08111111,"width":0.00069444446,"height":0.0011111111},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.11111111,"top":0.04111111,"width":0.21597221,"height":0.033333335},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
6194137157482527795
|
6113912317546658616
|
click
|
accessibility
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Zoom in
Open in new window
Enter Full Screen
Stefka Stoyanova
Nikolay Yankov
Steliyan Georgiev
Nikolay Ivanov
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:55
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Your microphone is off.
meet.google.com / Daily - Platform...
|
NULL
|
|
8613
|
166
|
6
|
2026-04-14T06:55:57.999851+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149757999_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFileEditViewTabsBookmarksHistoryExtensionsWindo DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)SRD-677XF7 (SRD-67C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsE3 SSHEl DatadogPlatform Team200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byJY-20458 CRUD reports with Ask. Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..Assignee3 Steliyan...i2 Stellyan .-.Anellya A...& Unassign...DEPLOYEDDONEDONE MDONEDONE MDONEDONEWUnEActionsStatusINDEYVINDEV vREADY FOR DEV <READY FOR DEVSTurn on microphone (88 + d)9:55 AM | Daily - Platform(ablDaily - Platform - 10 m left100% [8• Tue 14 Apr 9:55:57ChatCE: DeperDependolD TasksInsights & Coachin• DelD ux.© 2Xz Improve Story~ Details32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failing5 days agcRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons8•• Far DepeGroup: QueriesDEPLOY &leature fiag releaseDeployed12006515 * *00=AI Review - Q1CTNMEUANANORCORDeployed_19982 |1 0 •*=•=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 11 ****=Missed deletion of stale0 -400R3 409009(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay lvanov#Lukas Kovalik10:33...
|
NULL
|
-7529556405988642857
|
NULL
|
visual_change
|
ocr
|
NULL
|
DiaFileEditViewTabsBookmarksHistoryExtensionsWindo DiaFileEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)SRD-677XF7 (SRD-67C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsE3 SSHEl DatadogPlatform Team200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byJY-20458 CRUD reports with Ask. Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..Assignee3 Steliyan...i2 Stellyan .-.Anellya A...& Unassign...DEPLOYEDDONEDONE MDONEDONE MDONEDONEWUnEActionsStatusINDEYVINDEV vREADY FOR DEV <READY FOR DEVSTurn on microphone (88 + d)9:55 AM | Daily - Platform(ablDaily - Platform - 10 m left100% [8• Tue 14 Apr 9:55:57ChatCE: DeperDependolD TasksInsights & Coachin• DelD ux.© 2Xz Improve Story~ Details32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failing5 days agcRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons8•• Far DepeGroup: QueriesDEPLOY &leature fiag releaseDeployed12006515 * *00=AI Review - Q1CTNMEUANANORCORDeployed_19982 |1 0 •*=•=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 11 ****=Missed deletion of stale0 -400R3 409009(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaNikolay YankovSteliyan GeorgievNikolay lvanov#Lukas Kovalik10:33...
|
NULL
|
|
8614
|
167
|
16
|
2026-04-14T06:55:58.291682+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149758291_m2.jpg...
|
Dia
|
Work: Meet - Daily - Pla…
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
Mute Steliyan Georgiev's microphone
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:55
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Turn on microphone (⌘ + d)
Your microphone is off.
meet.google.com / Daily - Platform...
|
[{"role":"AXHeading","text" [{"role":"AXHeading","text":"Nikolay Yankov (Presenting)","depth":13,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.068750024},"role_description":"heading"},{"role":"AXStaticText","text":"Nikolay Yankov (Presenting)","depth":14,"bounds":{"left":0.2589844,"top":1.0,"width":0.0703125,"height":-0.06944442},"role_description":"text"},{"role":"AXPopUpButton","text":"People","depth":15,"bounds":{"left":0.7296875,"top":1.0,"width":0.023046875,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"6","depth":18,"bounds":{"left":0.7453125,"top":1.0,"width":0.002734375,"height":-0.068750024},"role_description":"text"},{"role":"AXPopUpButton","text":"Take notes with Gemini","depth":15,"bounds":{"left":0.7558594,"top":1.0,"width":0.0140625,"height":-0.061805606},"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Take notes with Gemini","depth":19,"bounds":{"left":0.75742185,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXStaticText","text":"Gemini","depth":19,"bounds":{"left":0.7746094,"top":1.0,"width":0.0125,"height":-0.068750024},"role_description":"text"},{"role":"AXButton","text":"Gemini","depth":18,"bounds":{"left":0.7734375,"top":1.0,"width":0.01328125,"height":-0.0625},"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Unpin Nikolay Yankov's presentation from your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else's presentation","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Zoom in","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open in new window","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Enter Full Screen","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Pin Stefka Stoyanova to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Stefka Stoyanova","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Stefka Stoyanova","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Yankov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Nikolay Yankov's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Yankov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Yankov","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Steliyan Georgiev to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Mute Steliyan Georgiev's microphone","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Steliyan Georgiev","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Steliyan Georgiev","depth":17,"role_description":"text"},{"role":"AXPopUpButton","text":"Pin Nikolay Ivanov to your main screen","depth":15,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"You can't unmute someone else","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Nikolay Ivanov","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Nikolay Ivanov","depth":17,"role_description":"text"},{"role":"AXButton","text":"You’re continuously framed","depth":15,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Backgrounds and effects","depth":15,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options for Lukas Kovalik","depth":16,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Lukas Kovalik","depth":17,"role_description":"text"},{"role":"AXButton","text":"Others might see more of your background. Click to view your full video.","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"9:55","depth":10,"role_description":"text"},{"role":"AXStaticText","text":"AM","depth":10,"role_description":"text"},{"role":"AXHeading","text":"Daily - Platform","depth":10,"role_description":"heading"},{"role":"AXStaticText","text":"Daily - Platform","depth":13,"role_description":"text"},{"role":"AXPopUpButton","text":"Audio settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on microphone","depth":14,"role_description":"button","is_enabled":true,"is_focused":true,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"Video settings","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn off camera","depth":14,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Nikolay Yankov is presenting","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Send a reaction","depth":12,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Turn on captions","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXCheckBox","text":"Raise hand (ctrl + ⌘ + h)","depth":11,"role_description":"toggle button","subrole":"AXToggleButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXPopUpButton","text":"More options","depth":14,"role_description":"pop up button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Leave call","depth":11,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting details","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Chat with everyone","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Meeting tools","depth":12,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Turn on microphone (⌘ + d)","depth":6,"role_description":"text"},{"role":"AXStaticText","text":"Your microphone is off.","depth":4,"bounds":{"left":0.23554687,"top":1.0,"width":0.000390625,"height":-0.050694466},"role_description":"text"},{"role":"AXTextArea","text":"meet.google.com / Daily - Platform","depth":5,"bounds":{"left":0.2957031,"top":1.0,"width":0.12148438,"height":-0.02569449},"automation_id":"navigationBarAssistantBarTextField","value":"meet.google.com / Daily - Platform","role_description":"text entry area","is_focused":false}]...
|
1966615636940410072
|
-3397478336591832296
|
click
|
hybrid
|
NULL
|
Nikolay Yankov (Presenting)
Nikolay Yankov (Presen Nikolay Yankov (Presenting)
Nikolay Yankov (Presenting)
People
6
Take notes with Gemini
Take notes with Gemini
Gemini
Gemini
Unpin Nikolay Yankov's presentation from your main screen
You can't unmute someone else's presentation
More options for Nikolay Yankov
Zoom in
Open in new window
Enter Full Screen
Pin Stefka Stoyanova to your main screen
You can't unmute someone else
More options for Stefka Stoyanova
Stefka Stoyanova
Pin Nikolay Yankov to your main screen
Mute Nikolay Yankov's microphone
More options for Nikolay Yankov
Nikolay Yankov
Pin Steliyan Georgiev to your main screen
Mute Steliyan Georgiev's microphone
More options for Steliyan Georgiev
Steliyan Georgiev
Pin Nikolay Ivanov to your main screen
You can't unmute someone else
More options for Nikolay Ivanov
Nikolay Ivanov
You’re continuously framed
Backgrounds and effects
More options for Lukas Kovalik
Lukas Kovalik
Others might see more of your background. Click to view your full video.
9:55
AM
Daily - Platform
Daily - Platform
Audio settings
Turn on microphone
Video settings
Turn off camera
Nikolay Yankov is presenting
Send a reaction
Turn on captions
Raise hand (ctrl + ⌘ + h)
More options
Leave call
Meeting details
Chat with everyone
Meeting tools
Turn on microphone (⌘ + d)
Your microphone is off.
meet.google.com / Daily - Platform
FirefoxEditHistoryBookmarksProfilesToolsWindow Helpdoo.sideme.minny.com/di-teoorls~ Google Gemini.. X‹ 40 lil © I Daily - Platform • 10mleft (A 100% C2 2• Tue 14 Apr 9:55:58Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny- New laosocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 2026{03 Ask Jiminny reportsPeriodiReport Type10• Clear all |FREQUENCYDailySHAREDDATEY09/04/9016ACTIONSare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiPro vYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageJY-18909-automated-reports-ask-iiminny = 869453)...
|
NULL
|
|
8617
|
166
|
8
|
2026-04-14T06:56:10.958149+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149770958_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byA JY-20458 CRUD reports with Ask Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..AssigneeStatus3 Steliyan...i2 Stellyan .-.INDEYVINDEV vAnellya A...READY FOR DEV& Unassign...READY FOR DEVSDEPLOYEDDONEDONEDONEDONE MDONEDONEWUnEActions9:56 AM | Daily - Platform> 0.Daily - Platform - 9m left8•CE: DepeDependol• Far DepeD TasksInsights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 ^ ****=|Missed deletion of stalea080se(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan Georgievvalik10:468 • Tue 14 Apr 9:56:10Chat6Nikolay YankovNikolay Ivanov...
|
NULL
|
-1740625934814605516
|
NULL
|
click
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byA JY-20458 CRUD reports with Ask Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..AssigneeStatus3 Steliyan...i2 Stellyan .-.INDEYVINDEV vAnellya A...READY FOR DEV& Unassign...READY FOR DEVSDEPLOYEDDONEDONEDONEDONE MDONEDONEWUnEActions9:56 AM | Daily - Platform> 0.Daily - Platform - 9m left8•CE: DepeDependol• Far DepeD TasksInsights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsentoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 ^ ****=|Missed deletion of stalea080se(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan Georgievvalik10:468 • Tue 14 Apr 9:56:10Chat6Nikolay YankovNikolay Ivanov...
|
NULL
|
|
8618
|
167
|
18
|
2026-04-14T06:56:10.977526+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149770977_m2.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. XC< 40Wl © | Daily -Platform 9mleft A 100%Ca &• Tue 14 Apr 9:56:10Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 2026S03 Ask Jiminny reportsPeriodiReport Type10• Clear all |FREQUENCY+DailySHAREDICCanDATEY09/04/9016PrevieiNsP C шare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageGapp.staging.jiminny.com/ai-...
|
NULL
|
1450053700510574607
|
NULL
|
click
|
ocr
|
NULL
|
FirefoxFileEoitViewHistoryBookmarksProfilesToolsWi FirefoxFileEoitViewHistoryBookmarksProfilesToolsWindow Helpdoo.sidemne.minny.com/dl-teoorls~ Google Gemini.. XC< 40Wl © | Daily -Platform 9mleft A 100%Ca &• Tue 14 Apr 9:56:10Platform Sprint 1 Q2 - Platform Tea@ Jy 19798 evaluation for ai activity t8 Jiminny+ New labsocc ce Cdcseciie3. Run: Executes the evaluation viaasyncio.run(run_evaluation(...)),which orchestrates the collection,endpoint calling, and result scoringoeses.Al Reports• Report nameNAME -Ask Jiminny Test Report - 8 Apr 2026S03 Ask Jiminny reportsPeriodiReport Type10• Clear all |FREQUENCY+DailySHAREDICCanDATEY09/04/9016PrevieiNsP C шare there soem issueswith this PRShow thinking vYes, there are several issues and code smellsin this Pull Request, ranging from performancebottlenecks to architectural anti-patterns.Here is a breakdown of the most notableissues, categorized by impact:Performance & Async Anti-Patterns• Sequential Requests Defeating Async(Lines 86-106): Inside_collect_from_es, the code loopsthrough raw_samples and await sasession.post request for each sampleone by one. This means the script waitsfor one HTTP request to finish beforestarting the next, completely defeatingthe purpose of using aiohttp forasynchronous concurrency.• Fix: Collect these coroutines into alist and use asyncio.gather ()(ideally wrapped in anasyncio. Semaphore to avoidoverwhelming the endpoint) to run• Enter a prompt for GeminiProvYour Jiminny chats aren't used to improve our models. Gemini is Alane can make mistakes, Inciualne aoout peodleYour privacy & GeminiSummarize pageGapp.staging.jiminny.com/ai-...
|
NULL
|
|
8620
|
167
|
20
|
2026-04-14T06:56:15.044426+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149775044_m2.jpg...
|
Firefox
|
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf — Work...
|
1
|
app.staging.jiminny.com/ai-reports/pdf/e2f4890a-d0 app.staging.jiminny.com/ai-reports/pdf/e2f4890a-d097-495b-8dfd-d4c25a5e8843...
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jiminny
Jiminny
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Listen
Show more options
Show thinking
Gemini said
Gemini said
Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues, categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
, the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions. In Python, this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
, or run the blocking I/O operations in a separate thread using
asyncio.to_thread()
.
Architecture & State Management
Architecture & State Management
Mutating Class-Level Variables (Lines 323-328):
The script parses CLI arguments and directly mutates the class attributes (e.g.,
AiActivityTypeEvaluator.DATA_SET = args.data_set
). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.
Fix:
Pass these arguments into the
__init__
constructor of the instance (
self.data_set = args.data_set
) rather than modifying the class definition globally.
Code Quality & Edge Cases
Code Quality & Edge Cases
Broad Exception Catching (Line 107):
Using a bare
except Exception:
during the replayability probe is dangerous. It will catch
aiohttp.ClientError
(which is intended), but it will also silently swallow
TimeoutError
or even memory errors, making debugging incredibly difficult if the endpoint goes down.
Fix:
Scope the exception to specific expected errors, like
except (aiohttp.ClientError, asyncio.TimeoutError):
.
Inefficient Size Calculation (Line 95):
len(json.dumps(request_body)) < min_size
requires serializing the entire dictionary to a string
just
to check its length. Since
min_size
defaults to
0
, the
if min_size
short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.
Potential Path Traversal Risk (Lines 110, 113, 251):
The script writes and reads files directly using
f"{es_id}.json"
and
f"gt_{req_id}.json"
. If
es_id
or
req_id
originate from a database and ever contain characters like
../
, it could result in files being read from or written to unintended directories.
Fix:
Sanitize the IDs or strictly enforce that they only contain alphanumeric characters.
Would you like me to rewrite the
_collect_from_es
method to show how to fix the sequential async loop and blocking file I/O?
Enter a prompt for Gemini
encrypted
Enter a prompt for Gemini
encrypted
Open upload file menu
Tools...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"bounds":{"left":0.00234375,"top":0.045138888,"width":0.0890625,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"bounds":{"left":0.0,"top":0.08263889,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"bounds":{"left":0.015625,"top":0.09236111,"width":0.11796875,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"bounds":{"left":0.0,"top":0.11111111,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"bounds":{"left":0.015625,"top":0.12083333,"width":0.1984375,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jiminny","depth":4,"bounds":{"left":0.0,"top":0.13958333,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jiminny","depth":5,"bounds":{"left":0.015625,"top":0.14930555,"width":0.015625,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf","depth":4,"bounds":{"left":0.0,"top":0.16805555,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf","depth":5,"bounds":{"left":0.015625,"top":0.17777778,"width":0.1640625,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.07890625,"top":0.17430556,"width":0.009375,"height":0.016666668},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.003125,"top":0.19791667,"width":0.08710937,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.003125,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.01640625,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"bounds":{"left":0.029296875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.0421875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.05546875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.2171875,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.23125,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"WORK, Google Account: lukas.kovalik@jiminny.com","depth":12,"bounds":{"left":0.228125,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.0984375,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Chat","depth":12,"bounds":{"left":0.1953125,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.2109375,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.09335937,"top":0.12847222,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.09335937,"top":0.13055556,"width":0.14101562,"height":0.022222223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":22,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This Python script defines the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Core Functionality","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Core Functionality","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Data Collection (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"collect_test_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"* Supports fetching raw samples from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Elasticsearch (ES)","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or loading them from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"local JSON files","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Implements a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"replayability check","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":": It probes the live API endpoint to ensure the activity exists before including it in the dataset.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Filters data based on request body size, language, and the presence of an original response.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Ground Truth Management:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Saves inputs and \"gold standard\" responses to a local directory structure (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/...","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":").","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Features an optional mode to populate ground truth using a high-order LLM (e.g.,","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"gpt-5_2","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") if the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"POPULATE_GROUND_TRUTH_LLM","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"flag is set.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Evaluation Logic (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"evaluate_against_ground_truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Compares the model's predicted","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ai_activity_type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"against either the original ES response or the LLM-generated ground truth.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Assigns a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"quality_score","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"of","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a match and","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a mismatch or request failure.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Key Configuration Constants","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Key Configuration Constants","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"COMPARE_TO_GROUND_TRUTH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SKIP_UNREPLAYABLE_ON_COLLECT","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"REPLAYABILITY_CHECK_TIMEOUT_SECONDS","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"call/ai-activity-type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Determines if the baseline is the original data (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") or an","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"llm","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", discards records where the API returns \"Cannot find activity.\"","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Limits the probe request to 20 seconds.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy table","depth":24,"bounds":{"left":0.22070312,"top":0.0,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Execution Workflow","depth":23,"bounds":{"left":0.103125,"top":0.0,"width":0.13320312,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Execution Workflow","depth":24,"bounds":{"left":0.103125,"top":0.0,"width":0.060546875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Parsing:","depth":26,"bounds":{"left":0.1171875,"top":0.009027778,"width":0.024609376,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script uses","depth":26,"bounds":{"left":0.14179687,"top":0.009027778,"width":0.04609375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"create_parser","depth":27,"bounds":{"left":0.19023438,"top":0.010416667,"width":0.042578124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to handle CLI arguments like","depth":26,"bounds":{"left":0.1171875,"top":0.009027778,"width":0.119140625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--data-set","depth":27,"bounds":{"left":0.1171875,"top":0.028472222,"width":0.107421875,"height":0.03125},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.12929687,"top":0.045138888,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--populate-ground-truth-llm","depth":27,"bounds":{"left":0.13476562,"top":0.046527777,"width":0.08867188,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", and","depth":26,"bounds":{"left":0.1171875,"top":0.045138888,"width":0.11171875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--compare-to-ground-truth","depth":27,"bounds":{"left":0.13203125,"top":0.06458333,"width":0.08203125,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"bounds":{"left":0.21640626,"top":0.063194446,"width":0.0015625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Initialization:","depth":26,"bounds":{"left":0.1171875,"top":0.08680555,"width":0.039453126,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sets up directory paths for inputs and ground truths based on the selected dataset name.","depth":26,"bounds":{"left":0.1171875,"top":0.08680555,"width":0.11875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Run:","depth":26,"bounds":{"left":0.1171875,"top":0.14652778,"width":0.013671875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Executes the evaluation via","depth":26,"bounds":{"left":0.13085938,"top":0.14652778,"width":0.080078125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.run(run_evaluation(...))","depth":27,"bounds":{"left":0.11953125,"top":0.16597222,"width":0.10507812,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", which orchestrates the collection, endpoint calling, and result scoring phases.","depth":26,"bounds":{"left":0.1171875,"top":0.16458334,"width":0.11289062,"height":0.06875},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.1203125,"top":0.28194445,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.1375,"top":0.28194445,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said are there soem issues with this PR","depth":21,"bounds":{"left":0.1609375,"top":0.29027778,"width":0.072265625,"height":0.03888889},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.09335937,"top":0.2923611,"width":0.0234375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"are there soem issues with this PR","depth":23,"bounds":{"left":0.1609375,"top":0.29305556,"width":0.062109374,"height":0.034027778},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":22,"bounds":{"left":0.22382812,"top":0.35694444,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"bounds":{"left":0.22382812,"top":0.35208333,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":26,"bounds":{"left":0.11953125,"top":0.35902777,"width":0.036328126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":20,"bounds":{"left":0.11679687,"top":0.3951389,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"bounds":{"left":0.11679687,"top":0.39652777,"width":0.048046876,"height":0.016666668},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.","depth":24,"bounds":{"left":0.10390625,"top":0.40347221,"width":0.13242188,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of the most notable issues, categorized by impact:","depth":24,"bounds":{"left":0.10390625,"top":0.46875,"width":0.11640625,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Performance & Async Anti-Patterns","depth":23,"bounds":{"left":0.10390625,"top":0.5222222,"width":0.13242188,"height":0.016666668},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Performance & Async Anti-Patterns","depth":24,"bounds":{"left":0.10390625,"top":0.5236111,"width":0.10859375,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sequential Requests Defeating Async (Lines 86-106):","depth":26,"bounds":{"left":0.11757813,"top":0.5465278,"width":0.11601563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inside","depth":26,"bounds":{"left":0.16289063,"top":0.56458336,"width":0.019921875,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"_collect_from_es","depth":27,"bounds":{"left":0.11992188,"top":0.58402777,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", the code loops through","depth":26,"bounds":{"left":0.11757813,"top":0.58263886,"width":0.10429688,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"raw_samples","depth":27,"bounds":{"left":0.14414063,"top":0.6020833,"width":0.0359375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"bounds":{"left":0.18242188,"top":0.6006944,"width":0.0140625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"await","depth":27,"bounds":{"left":0.19882813,"top":0.6020833,"width":0.01640625,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"s a","depth":26,"bounds":{"left":0.21757813,"top":0.6006944,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"session.post","depth":27,"bounds":{"left":0.11992188,"top":0.6201389,"width":0.039453126,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using","depth":26,"bounds":{"left":0.11757813,"top":0.61875,"width":0.11640625,"height":0.08680555},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp","depth":27,"bounds":{"left":0.18046875,"top":0.6923611,"width":0.023046875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for asynchronous concurrency.","depth":26,"bounds":{"left":0.11757813,"top":0.6909722,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.7326389,"width":0.009765625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Collect these coroutines into a list and use","depth":28,"bounds":{"left":0.1296875,"top":0.7326389,"width":0.09921875,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.gather()","depth":29,"bounds":{"left":0.16601562,"top":0.75208336,"width":0.05234375,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(ideally wrapped in an","depth":28,"bounds":{"left":0.1296875,"top":0.75069445,"width":0.09257813,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.Semaphore","depth":29,"bounds":{"left":0.13203125,"top":0.7881944,"width":0.055859376,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to avoid overwhelming the endpoint) to run the replayability probes concurrently.","depth":28,"bounds":{"left":0.1296875,"top":0.78680557,"width":0.10546875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):","depth":26,"bounds":{"left":0.11757813,"top":0.84652776,"width":0.10976563,"height":0.03263889},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Standard synchronous file operations (","depth":26,"bounds":{"left":0.11757813,"top":0.8645833,"width":0.107421875,"height":0.050694443},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"with open(...)","depth":27,"bounds":{"left":0.12226562,"top":0.90208334,"width":0.045703124,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.1703125,"top":0.90069443,"width":0.003515625,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.dump","depth":27,"bounds":{"left":0.17617187,"top":0.90208334,"width":0.029296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":",","depth":26,"bounds":{"left":0.2078125,"top":0.90069443,"width":0.003125,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"json.load","depth":27,"bounds":{"left":0.11992188,"top":0.9201389,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") are used inside","depth":26,"bounds":{"left":0.15195313,"top":0.91875,"width":0.048046876,"height":0.014583333},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"async def","depth":27,"bounds":{"left":0.20234375,"top":0.9201389,"width":0.0296875,"height":0.013194445},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"functions. In Python, this blocks the entire event loop while the disk is reading/writing.","depth":26,"bounds":{"left":0.11757813,"top":0.91875,"width":0.11875,"height":0.06875},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"bounds":{"left":0.1296875,"top":0.9965278,"width":0.009765625,"height":0.003472209},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Use an asynchronous file I/O library like","depth":28,"bounds":{"left":0.1296875,"top":0.9965278,"width":0.09335937,"height":0.003472209},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiofiles","depth":29,"bounds":{"left":0.1625,"top":1.0,"width":0.0265625,"height":-0.015972257},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", or run the blocking I/O operations in a separate thread using","depth":28,"bounds":{"left":0.1296875,"top":1.0,"width":0.10585938,"height":-0.014583349},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"asyncio.to_thread()","depth":29,"bounds":{"left":0.16914062,"top":1.0,"width":0.0625,"height":-0.052083373},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"bounds":{"left":0.23398438,"top":1.0,"width":0.0015625,"height":-0.050694466},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Architecture & State Management","depth":23,"bounds":{"left":0.10390625,"top":1.0,"width":0.13242188,"height":-0.08611107},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Architecture & State Management","depth":24,"bounds":{"left":0.10390625,"top":1.0,"width":0.103515625,"height":-0.087499976},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Mutating Class-Level Variables (Lines 323-328):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script parses CLI arguments and directly mutates the class attributes (e.g.,","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator.DATA_SET = args.data_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Pass these arguments into the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"__init__","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"constructor of the instance (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"self.data_set = args.data_set","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") rather than modifying the class definition globally.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Code Quality & Edge Cases","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Code Quality & Edge Cases","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Broad Exception Catching (Line 107):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Using a bare","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"except Exception:","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"during the replayability probe is dangerous. It will catch","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"aiohttp.ClientError","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(which is intended), but it will also silently swallow","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TimeoutError","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or even memory errors, making debugging incredibly difficult if the endpoint goes down.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Scope the exception to specific expected errors, like","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"except (aiohttp.ClientError, asyncio.TimeoutError):","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Inefficient Size Calculation (Line 95):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"len(json.dumps(request_body)) < min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"requires serializing the entire dictionary to a string","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"just","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"to check its length. Since","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"defaults to","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", the","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"if min_size","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Potential Path Traversal Risk (Lines 110, 113, 251):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The script writes and reads files directly using","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"f\"{es_id}.json\"","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"f\"gt_{req_id}.json\"","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". If","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"es_id","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"req_id","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"originate from a database and ever contain characters like","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"../","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", it could result in files being read from or written to unintended directories.","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Fix:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sanitize the IDs or strictly enforce that they only contain alphanumeric characters.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Would you like me to rewrite the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"_collect_from_es","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"method to show how to fix the sequential async loop and blocking file I/O?","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextArea","text":"Enter a prompt for Gemini\nencrypted","depth":20,"bounds":{"left":0.109375,"top":0.8354167,"width":0.125,"height":0.016666668},"value":"Enter a prompt for Gemini\nencrypted","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Enter a prompt for Gemini","depth":21,"bounds":{"left":0.1171875,"top":0.8354167,"width":0.08203125,"height":0.016666668},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"encrypted","depth":21,"bounds":{"left":0.10820313,"top":0.8354167,"width":0.0078125,"height":0.016666668},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"bounds":{"left":0.1046875,"top":0.86527777,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.1234375,"top":0.86527777,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false}]...
|
-1804690491028682382
|
8732321575042068887
|
visual_change
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jiminny
Jiminny
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:
call/ai-activity-type
.
COMPARE_TO_GROUND_TRUTH
Determines if the baseline is the original data (
es
) or an
llm
.
SKIP_UNREPLAYABLE_ON_COLLECT
If
True
, discards records where the API returns "Cannot find activity."
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Limits the probe request to 20 seconds.
Constant
ENDPOINT_PATH
COMPARE_TO_GROUND_TRUTH
SKIP_UNREPLAYABLE_ON_COLLECT
REPLAYABILITY_CHECK_TIMEOUT_SECONDS
Purpose
The specific API route being tested:
call/ai-activity-type
.
Determines if the baseline is the original data (
es
) or an
llm
.
If
True
, discards records where the API returns "Cannot find activity."
Limits the probe request to 20 seconds.
Copy table
Execution Workflow
Execution Workflow
Parsing:
The script uses
create_parser
to handle CLI arguments like
--data-set
,
--populate-ground-truth-llm
, and
--compare-to-ground-truth
.
Initialization:
Sets up directory paths for inputs and ground truths based on the selected dataset name.
Run:
Executes the evaluation via
asyncio.run(run_evaluation(...))
, which orchestrates the collection, endpoint calling, and result scoring phases.
Copy prompt
Edit
You said are there soem issues with this PR
You said
are there soem issues with this PR
Listen
Show more options
Show thinking
Gemini said
Gemini said
Yes, there are several issues and code smells in this Pull Request, ranging from performance bottlenecks to architectural anti-patterns.
Here is a breakdown of the most notable issues, categorized by impact:
Performance & Async Anti-Patterns
Performance & Async Anti-Patterns
Sequential Requests Defeating Async (Lines 86-106):
Inside
_collect_from_es
, the code loops through
raw_samples
and
await
s a
session.post
request for each sample one by one. This means the script waits for one HTTP request to finish before starting the next, completely defeating the purpose of using
aiohttp
for asynchronous concurrency.
Fix:
Collect these coroutines into a list and use
asyncio.gather()
(ideally wrapped in an
asyncio.Semaphore
to avoid overwhelming the endpoint) to run the replayability probes concurrently.
Blocking File I/O in Async Functions (Lines 110-114, 184-188, 268-269):
Standard synchronous file operations (
with open(...)
,
json.dump
,
json.load
) are used inside
async def
functions. In Python, this blocks the entire event loop while the disk is reading/writing.
Fix:
Use an asynchronous file I/O library like
aiofiles
, or run the blocking I/O operations in a separate thread using
asyncio.to_thread()
.
Architecture & State Management
Architecture & State Management
Mutating Class-Level Variables (Lines 323-328):
The script parses CLI arguments and directly mutates the class attributes (e.g.,
AiActivityTypeEvaluator.DATA_SET = args.data_set
). This is a major anti-pattern. If you ever try to run tests in parallel or create multiple instances of this evaluator in a single process, the state will bleed across them.
Fix:
Pass these arguments into the
__init__
constructor of the instance (
self.data_set = args.data_set
) rather than modifying the class definition globally.
Code Quality & Edge Cases
Code Quality & Edge Cases
Broad Exception Catching (Line 107):
Using a bare
except Exception:
during the replayability probe is dangerous. It will catch
aiohttp.ClientError
(which is intended), but it will also silently swallow
TimeoutError
or even memory errors, making debugging incredibly difficult if the endpoint goes down.
Fix:
Scope the exception to specific expected errors, like
except (aiohttp.ClientError, asyncio.TimeoutError):
.
Inefficient Size Calculation (Line 95):
len(json.dumps(request_body)) < min_size
requires serializing the entire dictionary to a string
just
to check its length. Since
min_size
defaults to
0
, the
if min_size
short-circuit saves it for now, but if anyone sets a minimum size, this will become an unnecessary CPU bottleneck on large datasets.
Potential Path Traversal Risk (Lines 110, 113, 251):
The script writes and reads files directly using
f"{es_id}.json"
and
f"gt_{req_id}.json"
. If
es_id
or
req_id
originate from a database and ever contain characters like
../
, it could result in files being read from or written to unintended directories.
Fix:
Sanitize the IDs or strictly enforce that they only contain alphanumeric characters.
Would you like me to rewrite the
_collect_from_es
method to show how to fix the sequential async loop and blocking file I/O?
Enter a prompt for Gemini
encrypted
Enter a prompt for Gemini
encrypted
Open upload file menu
Tools...
|
NULL
|
|
8622
|
166
|
10
|
2026-04-14T06:56:40.315536+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149800315_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byA JY-20458 CRUD reports with Ask Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..AssigneeStatus3 Steliyan...i2 Stellyan .-.INDEYVINDEV vAnellya A...READY FOR DEV& Unassign...READY FOR DEVSDEPLOYEDDONEDONE MDONEDONE MDONEDONEWUnEActions9:56 AM | Daily - Platform> 0.Daily - Platform • 9 m left8•CE: DepeDependol• Far DepeD TasksInsights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsuntoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 ^ ****=|Missed deletion of stale408099(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka Stoyanovateliyan Georgievalik11:168 • Tue 14 Apr 9:56:40ChatNikolay YankovNikolay Ivanov...
|
NULL
|
-589028587563285134
|
NULL
|
visual_change
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byA JY-20458 CRUD reports with Ask Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..AssigneeStatus3 Steliyan...i2 Stellyan .-.INDEYVINDEV vAnellya A...READY FOR DEV& Unassign...READY FOR DEVSDEPLOYEDDONEDONE MDONEDONE MDONEDONEWUnEActions9:56 AM | Daily - Platform> 0.Daily - Platform • 9 m left8•CE: DepeDependol• Far DepeD TasksInsights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsuntoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 ^ ****=|Missed deletion of stale408099(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka Stoyanovateliyan Georgievalik11:168 • Tue 14 Apr 9:56:40ChatNikolay YankovNikolay Ivanov...
|
NULL
|
|
8625
|
167
|
22
|
2026-04-14T06:56:55.395589+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149815395_m2.jpg...
|
Firefox
|
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf — Work...
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jiminny
Jiminny
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app","depth":4,"bounds":{"left":0.00234375,"top":0.045138888,"width":0.0890625,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":4,"bounds":{"left":0.0,"top":0.08263889,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira","depth":5,"bounds":{"left":0.015625,"top":0.09236111,"width":0.11796875,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":4,"bounds":{"left":0.0,"top":0.11111111,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet","depth":5,"bounds":{"left":0.015625,"top":0.12083333,"width":0.1984375,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Jiminny","depth":4,"bounds":{"left":0.0,"top":0.13958333,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Jiminny","depth":5,"bounds":{"left":0.015625,"top":0.14930555,"width":0.015625,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf","depth":4,"bounds":{"left":0.0,"top":0.16805555,"width":0.09375,"height":0.028472222},"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf","depth":5,"bounds":{"left":0.015625,"top":0.17777778,"width":0.1640625,"height":0.009722223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.07890625,"top":0.17430556,"width":0.009375,"height":0.016666668},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.003125,"top":0.19791667,"width":0.08710937,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.003125,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.01640625,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Tabs from other devices","depth":6,"bounds":{"left":0.029296875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.0421875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.05546875,"top":0.97430557,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.2171875,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.23125,"top":0.047916666,"width":0.0125,"height":0.022222223},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"WORK, Google Account: lukas.kovalik@jiminny.com","depth":12,"bounds":{"left":0.228125,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.0984375,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Chat","depth":12,"bounds":{"left":0.1953125,"top":0.090277776,"width":0.015625,"height":0.027777778},"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.2109375,"top":0.090277776,"width":0.015625,"height":0.027777778},"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.09335937,"top":0.12847222,"width":0.000390625,"height":0.00069444446},"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.09335937,"top":0.13055556,"width":0.14101562,"height":0.022222223},"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":21,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\\nExpected: {gt_label}\\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.","depth":23,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":22,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":20,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"Gemini said","depth":20,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":21,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This Python script defines the","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AiActivityTypeEvaluator","depth":25,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Core Functionality","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Core Functionality","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Data Collection (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"collect_test_set","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"* Supports fetching raw samples from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Elasticsearch (ES)","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or loading them from","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"local JSON files","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Implements a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"replayability check","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":": It probes the live API endpoint to ensure the activity exists before including it in the dataset.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Filters data based on request body size, language, and the presence of an original response.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Ground Truth Management:","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Saves inputs and \"gold standard\" responses to a local directory structure (","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"eval_output/ai_activity_type/...","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":").","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Features an optional mode to populate ground truth using a high-order LLM (e.g.,","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"gpt-5_2","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":") if the","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"POPULATE_GROUND_TRUTH_LLM","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"flag is set.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Evaluation Logic (","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"evaluate_against_ground_truth","depth":27,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":26,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Compares the model's predicted","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ai_activity_type","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"against either the original ES response or the LLM-generated ground truth.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Assigns a","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"quality_score","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"of","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a match and","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.0","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"for a mismatch or request failure.","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Key Configuration Constants","depth":23,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Key Configuration Constants","depth":24,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Constant","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Purpose","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ENDPOINT_PATH","depth":29,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The specific API route being tested:","depth":28,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
5035648433098391144
|
5850017813519708295
|
click
|
accessibility
|
NULL
|
JY-20543 add AJ reports User pilot tracking by Lak JY-20543 add AJ reports User pilot tracking by LakyLak · Pull Request #11932 · jiminny/app
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Platform Sprint 1 Q2 - Platform Team - Scrum Board - Jira
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jy 19798 evaluation for ai activity types by nikolaybiaivanov · Pull Request #468 · jiminny/prophet
Jiminny
Jiminny
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Ask Jiminny test report - 8 Apr 2026 - Ask Jiminny test report - 13 Apr 2026.pdf
Close tab
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Tabs from other devices
Open history (⇧⌘H)
Open bookmarks (⌘B)
AI Chat settings
Close
WORK, Google Account: [EMAIL]
Main menu
New Chat
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected. Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
You said
I’m on page “<tabTitle>Jy 19798 evaluation for ai activity types by nikol</tabTitle>” with “<selection>@@ -0,0 +1,330 @@1+#!/usr/bin/env python32+import asyncio3+import json4+import os5+from pathlib import Path6+from typing import Any, Dict, List7+8+import aiohttp9+10+from src.component.call_processing.schemes.ai_activity_type_request import AiActivityTypeRequest11+from src.component.call_processing.schemes.ai_activity_type_response import AiActivityTypeResponse12+from src.eval.abstract_evaluator import AbstractEvaluator13+from src.eval.collectors import collect_test_set as es_collect_test_set14+from src.eval.common_evaluator_runner import create_parser, run_evaluation15+from src.eval.endpoints import run_endpoint as run_endpoint_http16+from src.eval.models import EvaluationResult17+18+19+class AiActivityTypeEvaluator(AbstractEvaluator):20+ENDPOINT_PATH = "call/ai-activity-type"21+REQUEST_MODEL = AiActivityTypeRequest22+RESPONSE_MODEL = AiActivityTypeResponse23+OUTPUT_SUBFOLDER = "ai_activity_type"24+USES_LLM_EVALUATION = False25+26+READS_FROM_DATASET_FOLDER = True27+28+DATA_SET: str = "initial"29+30+MINIMUM_REQUEST_BODY_SIZE: int = 031+32+SKIP_UNREPLAYABLE_ON_COLLECT: bool = True33+REPLAYABILITY_CHECK_TIMEOUT_SECONDS: float = 20.034+35+POPULATE_GROUND_TRUTH_LLM: bool = False36+GROUND_TRUTH_LLM_ENDPOINT: str = "OPENAI:gpt-5_2"37+38+COMPARE_TO_GROUND_TRUTH: str = "es"39+40+def __init__(self, **kwargs: Any) -> None:41+super().__init__(**kwargs)42+_repo_root = Path(__file__).parents[2]43+_datasets_root = _repo_root / "eval_output" / self.OUTPUT_SUBFOLDER / "eval_datasets"44+_dataset_dir = _datasets_root / self.__class__.DATA_SET45+self._dataset_dir = _dataset_dir46+self._inputs_dir = _dataset_dir / "input"47+self._ground_truths_dir = _dataset_dir / "ground_truth"48+49+async def collect_test_set(self) -> List[Dict[str, Any]]:50+if self.es_host_name:51+return await self._collect_from_es()52+return await self._collect_from_files()53+54+async def _collect_from_es(self) -> List[Dict[str, Any]]:55+if self._inputs_dir.exists() or self._ground_truths_dir.exists():56+raise FileExistsError(57+f"Dataset folder already exists: {self._inputs_dir.parent}. "58+"Choose a different --data-set name or remove the folder manually."59+ )60+61+raw_samples = await es_collect_test_set(62+es_host_name=self.es_host_name,63+es_port=self.es_port,64+endpoint_path=self.ENDPOINT_PATH,65+limit=self.limit,66+language_filter=self.language_filter,67+min_prompt_length=self.min_prompt_length,68+is_json_response=self.IS_JSON_RESPONSE,69+exclude_languages=self.exclude_languages,70+ )71+72+self._inputs_dir.mkdir(parents=True, exist_ok=True)73+self._ground_truths_dir.mkdir(parents=True, exist_ok=True)74+75+min_size = self.MINIMUM_REQUEST_BODY_SIZE76+samples: List[Dict[str, Any]] = []77+skipped_no_response = 078+skipped_too_small = 079+skipped_unreplayable = 080+skipped_probe_errors = 081+82+endpoint_url = f"{self.api_host}/{self.ENDPOINT_PATH}"83+84+timeout = aiohttp.ClientTimeout(total=self.REPLAYABILITY_CHECK_TIMEOUT_SECONDS)85+async with aiohttp.ClientSession(timeout=timeout) as session:86+for sample in raw_samples:87+es_id = sample["id"]88+request_body = sample["request_body"]89+original_response = sample.get("original_response")90+91+if not original_response:92+skipped_no_response += 193+continue94+95+if min_size and len(json.dumps(request_body)) < min_size:96+skipped_too_small += 197+continue98+99+if self.SKIP_UNREPLAYABLE_ON_COLLECT:100+try:101+async with session.post(endpoint_url, json=request_body) as resp:102+if resp.status != 200:103+body_text = await resp.text()104+if "Cannot find activity" in body_text:105+skipped_unreplayable += 1106+continue107+except Exception:108+skipped_probe_errors += 1109+110+with open(self._inputs_dir / f"{es_id}.json", "w") as f:111+json.dump(request_body, f, indent=2)112+113+with open(self._ground_truths_dir / f"gt_{es_id}.json", "w") as f:114+json.dump(original_response, f, indent=2)115+116+samples.append(117+ {118+"id": es_id,119+"request_body": request_body,120+"original_response": original_response,121+"ground_truth": original_response,122+ }123+ )124+125+print(f"Wrote {len(samples)} samples to {self._inputs_dir.parent}")126+if skipped_no_response:127+print(f"Skipped {skipped_no_response} records with no response body")128+if skipped_too_small:129+print(130+f"Skipped {skipped_too_small} records whose request body was smaller "131+f"than {min_size} characters"132+ )133+if skipped_unreplayable:134+print(f"Skipped {skipped_unreplayable} records that are not replayable (missing activity in ES)")135+if skipped_probe_errors:136+print(f"Warning: replayability probe errored for {skipped_probe_errors} records (kept those records)")137+138+if self.__class__.POPULATE_GROUND_TRUTH_LLM:139+label_results = await run_endpoint_http(140+test_set=samples,141+endpoint_url=endpoint_url,142+request_model=self.REQUEST_MODEL,143+force_llm_endpoint=self.__class__.GROUND_TRUTH_LLM_ENDPOINT,144+is_streaming_response=self.IS_STREAMING_RESPONSE,145+timeout_seconds=300.0,146+parallel_requests=self.parallel_requests,147+return_prompt_in_run=self.RETURN_PROMPT_IN_RUN,148+ )149+150+for i, r in enumerate(label_results):151+forced = self.__class__.GROUND_TRUTH_LLM_ENDPOINT152+model = forced.split(":", 1)[1] if ":" in forced else forced153+samples[i]["ground_truth_llm_model"] = model154+155+if not r.get("success", True):156+samples[i]["ground_truth_llm"] = None157+samples[i]["ground_truth_llm_reasoning"] = None158+samples[i]["ground_truth_llm_error_type"] = r.get("error_type")159+samples[i]["ground_truth_llm_error_message"] = r.get("error_message")160+continue161+162+resp = r.get("new_response") or {}163+samples[i]["ground_truth_llm"] = resp.get("ai_activity_type")164+samples[i]["ground_truth_llm_reasoning"] = resp.get("ai_activity_type_reasoning")165+166+return samples167+168+async def _collect_from_files(self) -> List[Dict[str, Any]]:169+if not self._inputs_dir.exists():170+raise FileNotFoundError(171+f"Dataset folder not found: {self._inputs_dir}. "172+"Run --steps collect first or check your --data-set name."173+ )174+175+samples: List[Dict[str, Any]] = []176+for filename in sorted(os.listdir(self._inputs_dir)):177+input_path = self._inputs_dir / filename178+gt_path = self._ground_truths_dir / f"gt_{filename}"179+180+if not gt_path.exists():181+print(f"Warning: ground truth not found for {filename}, skipping")182+continue183+184+with open(input_path, "r") as f:185+request_body: Dict[str, Any] = json.load(f)186+187+with open(gt_path, "r") as f:188+ground_truth: Dict[str, Any] = json.load(f)189+190+samples.append(191+ {192+"id": Path(filename).stem,193+"request_body": request_body,194+"original_response": None,195+"ground_truth": ground_truth,196+ }197+ )198+199+print(f"Loaded {len(samples)} samples from {self._inputs_dir.parent}")200+return samples201+202+async def evaluate_against_ground_truth(self, results: List[Dict[str, Any]]) -> List[EvaluationResult]:203+evaluation_results: List[EvaluationResult] = []204+compare_to = self.__class__.COMPARE_TO_GROUND_TRUTH205+206+for res in results:207+req_id = res["request_id"]208+is_success = res.get("success", True)209+210+if not is_success:211+evaluation_results.append(212+EvaluationResult(213+request_id=req_id,214+request_body=res["request_body"],215+new_response=res.get("new_response"),216+valid_schema=False,217+quality_score=0.0,218+quality_feedback=f"Request failed: {res.get('error_type')} - {res.get('error_message')}",219+response_time=res.get("response_time"),220+success=False,221+error_type=res.get("error_type"),222+error_message=res.get("error_message"),223+ground_truth_match=False,224+ )225+ )226+continue227+228+pred = (res.get("new_response") or {}).get("ai_activity_type")229+230+if compare_to == "llm":231+if "ground_truth_llm" not in res:232+evaluation_results.append(233+EvaluationResult(234+request_id=req_id,235+request_body=res["request_body"],236+new_response=res.get("new_response"),237+valid_schema=res.get("valid_schema", True),238+quality_score=0.0,239+quality_feedback=(240+"Missing ground_truth_llm in run results. Ensure collect step populated it and "241+"endpoints.py passes it through."242+ ),243+response_time=res.get("response_time"),244+success=True,245+ground_truth_match=False,246+ )247+ )248+continue249+gt_label = res.get("ground_truth_llm")250+else:251+gt_path = self._ground_truths_dir / f"gt_{req_id}.json"252+if not gt_path.exists():253+evaluation_results.append(254+EvaluationResult(255+request_id=req_id,256+request_body=res["request_body"],257+new_response=res.get("new_response"),258+valid_schema=res.get("valid_schema", True),259+quality_score=0.0,260+quality_feedback=f"Ground truth file not found: {gt_path}",261+response_time=res.get("response_time"),262+success=True,263+ground_truth_match=False,264+ )265+ )266+continue267+268+with open(gt_path, "r") as f:269+ground_truth_dict: Dict[str, Any] = json.load(f)270+271+gt_label = (ground_truth_dict or {}).get("ai_activity_type")272+273+matched = pred == gt_label274+275+evaluation_results.append(276+EvaluationResult(277+request_id=req_id,278+request_body=res["request_body"],279+new_response=res.get("new_response"),280+valid_schema=res.get("valid_schema", True),281+quality_score=1.0 if matched else 0.0,282+quality_feedback="PASS" if matched else f"FAIL\nExpected: {gt_label}\nGot: {pred}",283+response_time=res.get("response_time"),284+success=True,285+ground_truth_match=matched,286+ )287+ )288+289+return evaluation_results290+291+292+if __name__ == "__main__":293+parser = create_parser(AiActivityTypeEvaluator)294+parser.add_argument(295+"--data-set",296+type=str,297+default="initial",298+dest="data_set",299+help="Dataset folder under eval_output/ai_activity_type/eval_datasets/ (default: initial)",300+ )301+302+parser.add_argument(303+"--populate-ground-truth-llm",304+action="store_true",305+dest="populate_ground_truth_llm",306+ )307+parser.add_argument(308+"--ground-truth-llm-endpoint",309+type=str,310+default="OPENAI:gpt-5_2",311+dest="ground_truth_llm_endpoint",312+ )313+314+parser.add_argument(315+"--compare-to-ground-truth",316+type=str,317+choices=["es", "llm"],318+default="es",319+dest="compare_to_ground_truth",320+ )321+322+args = parser.parse_args()323+AiActivityTypeEvaluator.DATA_SET = args.data_set324+325+AiActivityTypeEvaluator.POPULATE_GROUND_TRUTH_LLM = bool(args.populate_ground_truth_llm)326+AiActivityTypeEvaluator.GROUND_TRUTH_LLM_ENDPOINT = args.ground_truth_llm_endpoint327+328+AiActivityTypeEvaluator.COMPARE_TO_GROUND_TRUTH = args.compare_to_ground_truth329+330+asyncio.run(run_evaluation(AiActivityTypeEvaluator, args))</selection>” selected.
Please summarize the selection using precise and concise language. Use headers and bulleted lists in the summary, to make it scannable. Maintain the meaning and factual accuracy.
Expand
Listen
Show more options
Gemini said
Gemini said
This Python script defines the
AiActivityTypeEvaluator
class, a specialized tool for evaluating AI-driven activity type classification. It extends a base evaluation framework to handle data collection, ground-truth comparison, and performance scoring.
Core Functionality
Core Functionality
Data Collection (
collect_test_set
):
* Supports fetching raw samples from
Elasticsearch (ES)
or loading them from
local JSON files
.
Implements a
replayability check
: It probes the live API endpoint to ensure the activity exists before including it in the dataset.
Filters data based on request body size, language, and the presence of an original response.
Ground Truth Management:
Saves inputs and "gold standard" responses to a local directory structure (
eval_output/ai_activity_type/...
).
Features an optional mode to populate ground truth using a high-order LLM (e.g.,
gpt-5_2
) if the
POPULATE_GROUND_TRUTH_LLM
flag is set.
Evaluation Logic (
evaluate_against_ground_truth
):
Compares the model's predicted
ai_activity_type
against either the original ES response or the LLM-generated ground truth.
Assigns a
quality_score
of
1.0
for a match and
0.0
for a mismatch or request failure.
Key Configuration Constants
Key Configuration Constants
Constant
Purpose
ENDPOINT_PATH
The specific API route being tested:...
|
NULL
|
|
8627
|
166
|
12
|
2026-04-14T06:56:59.495129+00:00
|
/Users/lukas/.screenpipe/data/data/2026-04-14/1776 /Users/lukas/.screenpipe/data/data/2026-04-14/1776149819495_m1.jpg...
|
NULL
|
NULL
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byJY-20458 CRUD reports with Ask. Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..AssigneeStatus3 Steliyan...i2 Stellyan .-.INDEYVINDEV vAnellya A...READY FOR DEV& Unassign...READY FOR DEVSDEPLOYEDDONEDONEDONEDONE MDONEDONEWUnEActions9:56 AM | Daily - Platform•10> 0.Daily - Platform • 9 m left8•CE: DepeDependol• Far DepeD Tasks2 Insights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsuntoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 П ****=Missed deletion of stale0 m400R 4.0000(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan Georgievalik11:358 • Tue 14 Apr 9:56:59ChatNikolay YankovNikolay Ivanov...
|
NULL
|
8266537810111475323
|
NULL
|
click
|
ocr
|
NULL
|
DiaFilelEditViewTabsBookmarksHistoryExtensionsWind DiaFilelEditViewTabsBookmarksHistoryExtensionsWindowHelpmeet.google.com / Daily - PlatformNikolay Yankov (Presenting)Inbox (2)PlactormSRD-677C Jy 19798(7 Service-oir Font. AmrCostI Clhttps://jiminny.atlassian.net/jira/software/c/projects/JY/boards/377selectedissue=JY-18909• Siminm0 ProjectsB AWSE3 SSHEl DatadogPlatform TeamQ Search board200$ Jy-19240 / [ JY-18900READY FOR DEV 1oohesnetixwirtexceed Fontawesomepackage limits(MAINTENANCISReady for Dev8-20564 1 П ****=PriorityStor..% JY-20574 [AI) Ensure PDF formatting is good= Medium= Medium% JY-20586 (QA) Manual Testing= Medium% JY-20626 Integration testing= MediumLinked work itemsis cloned byJY-20458 CRUD reports with Ask. Jiminnyis tested byJY-20649 Generate Ask Jiminny Report and display it in AI ReportsJY-20656 Expired Ask Jiminny report template is automatically disabled an.JY-20660 Non-creator cannot delete generated Ask Jiminny report from A…JY-20651 Generated Ask Jiminny Report PDF displays Jiminny logo when n...JY-20655 Generated Ask Jiminny Report emall is sent to recipients0 JY-20654 Generated Ask Jiminny Reoort is visible onty to creator and shar...JY-20650 Generated Ask Jiminny Report PDF includes branding, report se..AssigneeStatus3 Steliyan...i2 Stellyan .-.INDEYVINDEV vAnellya A...READY FOR DEV& Unassign...READY FOR DEVSDEPLOYEDDONEDONEDONEDONE MDONEDONEWUnEActions9:56 AM | Daily - Platform•10> 0.Daily - Platform • 9 m left8•CE: DepeDependol• Far DepeD Tasks2 Insights & CoachinD DevD ux.©2|In Dev vXz Improve Story~ Details32 Stellyan GeorgievAsuntoeReporter2 Galya DimitrovaUeyele encne@ Open with VS Code|3 branches32 commits1 pull request1 build failingRelessesProduction2 See all deploymentsComponentsPlatformSub-ProductAoorcons5 days agcD Al BookmarksGroup: QueriesDEPLOY &leature fiag releaseDeployed0 -20585 15 * ****=AI Review - Q1CTNMEUANANORCORDeployed_19982 1 0 •***=No such column "Type' onentty 'Event"PLATTORM STABTUTTYDeployed9 -20384 05 П ****=Missed deletion of stale0 m400R 4.0000(Pipedrive] DI > status notlabeled, date data missingREDUCE CHURNDeployed0 -20614 05 П ****=:Stefka StoyanovaSteliyan Georgievalik11:358 • Tue 14 Apr 9:56:59ChatNikolay YankovNikolay Ivanov...
|
NULL
|