Webcam Captures Text and Stores It in a MySQL Database

While visiting my family recently, I saw my dad entering numbers from each of the 5-8 ticket receipts he receives daily to keep track of the work he’s done, report for payroll, etc.  I knew there had to be an easier way to collect this information without keying in each ticket manually or using a clunky, slow scanner.  After a bit of research, I found an OCR API from Haven OnDemand and wrote a simple script that uses the camera on his laptop to snap pictures of the tickets, extracts the text and its position on each ticket, stores it all in a MySQL database, and keeps the ticket images in a digital archive.

Demo: Snapping image via webcam and storing text

The script itself is actually very simple:

<?php
// Connection details are placeholders; substitute your own.
$con = mysqli_connect('localhost', '<user>', '<pw>', '<db>');

$name = date('Y-m-d_H:i:s');
$newname="images/".$name.".jpg";
$file = file_put_contents( $newname, file_get_contents('php://input') );
if (!$file) {
 print "Unable to write image to directory.";
 exit();
}
else
{
 $filePath = 'http://' . $_SERVER['HTTP_HOST'] . dirname($_SERVER['REQUEST_URI']) . '/' . $newname;
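 // URL-encode the image address so its ':' and '/' characters survive as a query parameter.
 $result_json = file_get_contents("https://api.idolondemand.com/1/api/sync/ocrdocument/v1?apikey=<redacted>&url=" . urlencode($filePath) . "&mode=scene_photo");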
 
 $json_a=json_decode($result_json,true);
 $result_left=0;
 $result_top=0;
 $result_width=0;
 $result_height=0;
 
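 // Prepare the insert once; bound parameters keep quotes in the OCR text from breaking the query.
 $stmt = mysqli_prepare($con, "insert into image (name,pxleft,pxtop,pxwidth,pxheight,result) values (?,?,?,?,?,?)");

 // Each text block carries the recognized text plus its pixel position and size.
 foreach (($json_a['text_block'] ?? []) as $p) {
  $result_text = htmlspecialchars($p['text']);
  $result_left = $p['left'];
  $result_top = $p['top'];
  $result_width = $p['width'];
  $result_height = $p['height'];

  // All values are bound as strings, matching the quoted values in the original query.
  mysqli_stmt_bind_param($stmt, 'ssssss', $name, $result_left, $result_top, $result_width, $result_height, $result_text);
  $result = mysqli_stmt_execute($stmt);
  $value = mysqli_insert_id($con);
 }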
}

print "$filePath\n";
?>
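
To make the parsing step easier to see on its own, here is a minimal sketch of how the decoded OCR response maps onto the columns of the image table. The helper name ocr_rows and the sample JSON are my own assumptions for illustration; the sample is only shaped like the fields the script above reads (text_block, text, left, top, width, height) and is not real API output.

```php
<?php
// Hypothetical helper (not part of the original script): turn an OCR
// response body into rows matching the columns of the `image` table.
function ocr_rows(string $json, string $name): array
{
    $decoded = json_decode($json, true);
    $rows = [];

    // A failed request or malformed body yields no rows instead of a warning.
    if (!is_array($decoded) || !isset($decoded['text_block']) || !is_array($decoded['text_block'])) {
        return $rows;
    }

    foreach ($decoded['text_block'] as $block) {
        $rows[] = [
            'name'     => $name,
            'pxleft'   => (int) ($block['left'] ?? 0),
            'pxtop'    => (int) ($block['top'] ?? 0),
            'pxwidth'  => (int) ($block['width'] ?? 0),
            'pxheight' => (int) ($block['height'] ?? 0),
            // Same HTML escaping the script applies before storing the text.
            'result'   => htmlspecialchars((string) ($block['text'] ?? '')),
        ];
    }

    return $rows;
}

// Made-up sample input (an assumption, not captured from the API).
$sample = '{"text_block":[{"text":"Ticket #1042 <paid>","left":12,"top":30,"width":200,"height":18}]}';
$rows = ocr_rows($sample, '2016-01-01_12:00:00');
print_r($rows);
```

Keeping the parsing separate from the database insert means the OCR handling can be checked against saved responses without snapping a new picture or touching MySQL each time.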

While the script works well, the API isn’t great at picking up small text or parsing large amounts of text.  At only about 80% accuracy, it isn’t reliable enough to trust its interpretation of the text.  To compound the issues with the API, the images from the webcam are low quality – shaky hands, varying lighting, etc. all degrade the images I’m trying to extract text from.

This will be a project I keep playing with to improve the accuracy and speed of snapshots and storage.  Until I can figure out a way to improve the accuracy, though, this isn’t very practical.
