2

My code works as a plain Python file, but I am struggling to make it work with PyScript. Here is the code I tried.

main.py

import os
import cv2
import pandas as pd
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"Tesseract-OCR\tesseract.exe"
list_img = []
def fun_data(x):
    image = cv2.imread(list_img[x],0)
    thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    data = pytesseract.image_to_string(thresh, lang='eng',config='--psm 6')
    data = "\n".join([ll.rstrip() for ll in data.splitlines() if ll.strip()])
    data = data.split('\n')
    return data
def my_fun():
    directory = f'SQL_NOTES\\'
    file_names = os.listdir(directory)
    for file_name in file_names:
        if file_name.startswith("imagename"):
            list_img.append(directory + file_name)
    NumbersDict = dict({0 : 'list_img[0]', 1 : 'list_img[1]', 2 : 'list_img[2]', 3 : 'list_img[3]'})

    s = ([fun_data(a) for a in NumbersDict])
    return pd.DataFrame(s).T
print(my_fun())

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Empty Grass</title>

    <!-- Recommended meta tags -->
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width,initial-scale=1.0">

    <!-- PyScript CSS -->
    <link rel="stylesheet" href="https://pyscript.net/releases/2025.7.3/core.css">

    <!-- This script tag bootstraps PyScript -->
    <script type="module" src="https://pyscript.net/releases/2025.7.3/core.js"></script>
</head>
<body>
<button id="ex1">Introduction</button>
    <script type="py" config="./pyscript.toml" terminal>
from pyscript import when
@when("click", "#ex1")
def click_handler(event):
    from main import my_fun
    my_fun()
    </script>
</body>
</html>

pyscript.toml


packages = [ "pytesseract","opencv-python","pandas"]

[files]
"main.py" = "main.py"

The code extracts text from all the images and displays it as a DataFrame. I am mostly confused about how to rewrite pytesseract.pytesseract.tesseract_cmd = r"Tesseract-OCR\tesseract.exe" and directory = f'SQL_NOTES\\' for PyScript.

  • PyScript is not going to be able to execute a client-side exe, due to sandboxing. Commented Aug 21 at 15:51
  • PyScript has the same security restrictions as JavaScript. It can't read files directly, so you can't use os.listdir(). Commented Aug 21 at 16:25
  • Based on the documentation (PyScript and filesystems - PyScript), it may need to mount() a local folder into the virtual file system, and that works only on Chrome/Chromium. But I don't know if it allows running .exe programs on the local system. It may need a full web server that runs the .exe on the server side, with the browser sending data to the server and getting results back. Commented Aug 21 at 17:27
  • PyScript has a working directory, pyodide, which restricts access to local files. Is it possible to access the pyodide directory, drop my file there, and use it? Another way would be to just upload the Python file and make it work in the browser. Commented Aug 21 at 20:21
  • How do you expect native binaries (shared objects/DLLs) to run in a web page? PyScript cannot run an exe as a subprocess, nor can it use a native OpenCV install, not even if it's a Python module. stackoverflow.com/questions/72197753/… and the same applies to pytesseract. Commented Aug 21 at 20:25

3 Answers

1

That is impossible.

PyScript cannot start a subprocess to execute a tesseract executable. That excludes pytesseract.

PyScript cannot use native binary libraries (shared objects, DLLs). That excludes OpenCV.
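
To illustrate the first point: pytesseract is a wrapper that launches the tesseract executable as a separate process, so it only works where that binary exists and can be started. A minimal sketch of such a check follows; the function name `tesseract_available` is hypothetical and not part of pytesseract. In a browser sandbox there is no such executable to find, so a check like this would be expected to fail there.

```python
import shutil


def tesseract_available(cmd="tesseract"):
    """Return True if an executable named `cmd` can be found on PATH.

    `cmd` may also be an explicit path such as the value assigned to
    pytesseract.pytesseract.tesseract_cmd.
    """
    return shutil.which(cmd) is not None


# On a desktop with Tesseract installed this is typically True;
# without the binary, pytesseract has nothing to launch.
print(tesseract_available())
```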

0

PyScript can't run an external .exe program because browsers don't allow it for security reasons: an attacker could use this to get your passwords or other private data.

In some browsers it may not even have access to local files, for the same reason.

PyScript doc: PyScript and filesystems


You need a web server that receives the data from the browser, runs tesseract on the server side, and sends back the results.

A popular way to do this is to use Flask or FastAPI.
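
As a rough sketch of that server-side approach with Flask, assuming Flask, OpenCV, NumPy and pytesseract are installed on the server: the route `/ocr`, the form field name `image`, and the helper names `clean_lines` and `create_app` are my own choices for illustration, not anything from the question. The OCR steps mirror fun_data from the question.

```python
def clean_lines(text):
    """Drop blank lines and trailing whitespace, as fun_data does."""
    return [line.rstrip() for line in text.splitlines() if line.strip()]


def create_app():
    # Imported here so that clean_lines can be reused without these packages.
    import cv2
    import numpy as np
    import pytesseract
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.post("/ocr")  # hypothetical endpoint name
    def ocr():
        upload = request.files.get("image")  # hypothetical form field name
        if upload is None:
            return jsonify(error="no file field named 'image'"), 400

        # Decode the uploaded bytes into a grayscale image.
        array = np.frombuffer(upload.read(), np.uint8)
        image = cv2.imdecode(array, cv2.IMREAD_GRAYSCALE)
        if image is None:
            return jsonify(error="could not decode image"), 400

        # Same preprocessing and OCR settings as in the question.
        thresh = cv2.threshold(
            image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
        )[1]
        text = pytesseract.image_to_string(thresh, lang="eng", config="--psm 6")
        return jsonify(lines=clean_lines(text))

    return app
```

Calling create_app().run() would start Flask's development server. The page in the browser can then POST the image to /ocr and display the returned lines, instead of calling pytesseract itself.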

This method is probably what Jupyter, Dash, Bokeh, Streamlit, Gradio, Panel, NiceGUI, and Shiny use (hidden from the user).


Here is an example for NiceGUI.

It can even display a DataFrame as a table with ui.table.from_pandas(df).

from nicegui import ui
import os
import cv2
import pytesseract
import pandas as pd

# I don't need it on my computer
#pytesseract.pytesseract.tesseract_cmd = r"Tesseract-OCR\tesseract.exe"

def fun_data(path):
    image = cv2.imread(path, 0)

    thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    data = pytesseract.image_to_string(thresh, lang='eng',config='--psm 6')
    data = "\n".join([ll.rstrip() for ll in data.splitlines() if ll.strip()])
    data = data.split('\n')

    return data

def my_fun():
    directory = 'SQL_NOTES'
    directory = '.'  # my folder with images

    images = []

    for file_name in os.listdir(directory):
        if file_name.startswith("image"):
            images.append( os.path.join(directory, file_name) )

    result = [fun_data(a) for a in images]

    return pd.DataFrame(result).T

def click_handler(e):
    result = my_fun()

    result_label.set_text(result.to_string())  # display as text

    ui.table.from_pandas(result)   # display as table


# --- GUI on page ---

ui.button("Press to run", on_click=click_handler)

result_label = ui.label("Waiting for result...")

ui.run()

BTW: It can also use ui.upload to let users upload files.

from nicegui import ui
import os
import cv2
import pytesseract
import pandas as pd
import numpy as np

# I don't need it on my computer
#pytesseract.pytesseract.tesseract_cmd = r"Tesseract-OCR\tesseract.exe"

def upload_handler(e):

    file_bytes = e.content.read()
    array = np.frombuffer(file_bytes, np.uint8)
    image = cv2.imdecode(array, 0)

    thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    data = pytesseract.image_to_string(thresh, lang='eng',config='--psm 6')
    data = "\n".join([ll.rstrip() for ll in data.splitlines() if ll.strip()])
    data = data.split('\n')

    result = [data]

    df = pd.DataFrame(result).T

    result_label.set_text(df.to_string())
    ui.table.from_pandas(df)

# --- GUI on page ---

ui.upload(on_upload=upload_handler)

result_label = ui.label("Waiting for result...")

ui.run()

-1

I was able to use the tesseract package and OpenCV with the following code, which used to work. The other option was a desktop app with a Jinja front end.

<html>
    <head>
      <link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
      <script defer src="https://pyscript.net/alpha/pyscript.js"></script>
      <py-config>
          - autoclose_loader: true
          - runtimes:
            - src: "https://cdn.jsdelivr.net/pyodide/dev/full/pyodide.js"
              name: pyodide-dev
              lang: python
    </py-config>
    
      <py-env>
    - numpy
    - pandas
    - pytesseract
    - opencv-python
      </py-env>
    </head>

  <body>
    <py-script>
    import numpy as np
    import pandas as pd
    import pytesseract
    import os
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    </py-script>
  </body>
</html>

3 Comments

It was supposed to work on the desktop with a Jinja front end, but I connected it with PyScript and was getting everything in the browser: youtu.be/xLcw3if1eGE?si=Qw8C1SQracckhWdK
This code makes no sense. It imports some modules, but it doesn't run my_fun() or fun_data().
This code uses https://pyscript.net/alpha/pyscript.js, but this URL doesn't exist any more. It only displays an HTML page with the message Not Found.
