MyPetGoat t1_jdk8icb wrote on March 25, 2023 at 12:10 AM

Reply to comment by simmol in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

You’d need the model to be running all the time, observing what you’re doing on the computer. Could be done.

simmol t1_jdkd4pf wrote on March 25, 2023 at 12:45 AM

Seems quite inefficient, though. Can't GPT just read the HTML or other code behind the website and work from the text instead of from an image?
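
To make the text-instead-of-image idea concrete, here is a minimal sketch of reducing a page's HTML to its visible text before handing it to a text-only model. It uses only Python's standard `html.parser`. The names `VisibleTextExtractor` and `page_to_text` are made up for illustration, and nothing here is claimed to be how GPT-4 or any product actually does it.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def page_to_text(html):
    """Hypothetical helper: return the page's visible text, one chunk per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample = (
    "<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
    "<body><h1>Login</h1><p>Enter your <b>password</b></p>"
    "<button>Submit</button></body></html>"
)
print(page_to_text(sample))
```

The trade-off is that this only sees what is in the markup: layout, images, and anything drawn outside the HTML (native apps, canvas elements) is lost, which is where image input would still matter.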