Ask HN: Best way to extract web element coordinates from an image?

I'm building a QA testing agent and everything works, except that its accuracy in resolving web elements from an image is subpar. For example, I overlaid various types of grids onto the image, which helped the AI locate elements more accurately, but the results were still off the mark.
Is there a service I can use that will give me accurate viewport coordinates of a web element from an image?
I built a small DOM transformation tool that feeds a minimal DOM to my AI as context, and it works very well, but things get dicey with iframes and a few other cases. I then pivoted to base64-encoded screenshots, which have a much lower token count, but I can't solve my accuracy problem.
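To make the grid idea concrete, here is a minimal Python sketch of how a grid cell picked by the model could be mapped back to viewport coordinates. It is an illustration only: `Grid` and `cell_center_css` are hypothetical names, the grid is assumed to be uniform, and the cell indices are assumed to be zero-based. It also assumes the screenshot may have been captured at a device pixel ratio other than 1, in which case image pixels need to be divided by that ratio to get CSS pixels; forgetting this step is one possible source of a consistent offset.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    """Hypothetical description of a uniform grid drawn over a screenshot."""
    image_width: int   # screenshot width in image pixels
    image_height: int  # screenshot height in image pixels
    rows: int          # number of grid rows
    cols: int          # number of grid columns


def cell_center_css(
    grid: Grid, row: int, col: int, device_pixel_ratio: float = 1.0
) -> tuple[float, float]:
    """Return the center of a zero-based grid cell in CSS (viewport) pixels."""
    if not (0 <= row < grid.rows and 0 <= col < grid.cols):
        raise ValueError("cell is outside the grid")
    if device_pixel_ratio <= 0:
        raise ValueError("device_pixel_ratio must be positive")

    # Size of one cell in image pixels.
    cell_w = grid.image_width / grid.cols
    cell_h = grid.image_height / grid.rows

    # Center of the cell in image pixels.
    x_img = (col + 0.5) * cell_w
    y_img = (row + 0.5) * cell_h

    # Convert image pixels to CSS pixels (assumed scaling, see lead-in).
    return (x_img / device_pixel_ratio, y_img / device_pixel_ratio)
```

For a 2560x1440 screenshot taken at a device pixel ratio of 2 with a 9x16 grid, `cell_center_css(Grid(2560, 1440, 9, 16), 2, 3, 2.0)` gives `(280.0, 200.0)`. The result is only as precise as the cell size, so a coarse grid bounds the achievable accuracy regardless of how well the model picks the cell.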
Comments URL: https://news.ycombinator.com/item?id=42570167
Points: 1
# Comments: 0