Data Science Wizardry Blog by Attila Vajda

Blog article driven learning

The task of probability theory is to introduce a measure that quantifies uncertainty numerically and, building on it, to work out mathematical methods for calculating the probability of certain events (random mass phenomena).

Probability Waltz: Imagine uncertainty as a dance, where each step is a numerical measurement, gracefully leading us to the mathematical ballroom of event probabilities.

Visualising ideas with OpenDalleV1.1-GPU-Demo

Please write a minimal Python script to return all the synonyms of words in a string.
import nltk
nltk.download('wordnet')
nltk.download('punkt')
from nltk.corpus import wordnet

def get_synonyms(string):
    words = nltk.word_tokenize(string)
    synonyms = set()

    for word in words:
        synonyms.update(get_synonyms_of_word(word))

    return list(synonyms)

def get_synonyms_of_word(word):
    synonyms = set()
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            synonyms.add(lemma.name())
    return synonyms

string = "happy beautiful sad"
print(get_synonyms(string))

What is the chosen quantity when we observe the throwing of a die? The most common chosen quantity is the thrown number. It can be 1, 2, 3, 4, 5 or 6, and can have different properties.
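
The "different properties" of the thrown number can be treated as events. The following is a minimal sketch, assuming a fair die where each face is equally likely; the names `probability`, `is_even`, `is_prime` and `greater_than_four` are only illustrative.

```python
from fractions import Fraction

# The six possible thrown numbers of a die
outcomes = [1, 2, 3, 4, 5, 6]

def probability(event):
    """P(event) for a fair die: favourable outcomes / all outcomes."""
    favourable = [n for n in outcomes if event(n)]
    return Fraction(len(favourable), len(outcomes))

# A few properties the thrown number can have
def is_even(n):
    return n % 2 == 0

def is_prime(n):
    return n in (2, 3, 5)

def greater_than_four(n):
    return n > 4

print(probability(is_even))            # 1/2
print(probability(is_prime))           # 1/2
print(probability(greater_than_four))  # 1/3
```

Using `Fraction` keeps the results exact, so 2 favourable outcomes out of 6 shows up as 1/3 rather than a rounded decimal.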

PROBABILITY THEORY, experiment.

Some of the ideas that come up over the course of a day can be turned into Python MVPs. This is a fun part of learning Python: I develop the ability to create my own tools. Realising an idea can be approached as a puzzle; instead of postponing or discarding ideas for being too difficult or far-fetched, I attempt to build them. For example: "This text is an image, I can't select and copy the text. I wish I could paste it to a computer agent."

Please create a Python script for converting a photographed document to text.

Now, I have a bunch of Python code, and I can stop to figure it out, as a code puzzle.

import os
import sys
import pytesseract
from PIL import Image
from openpyxl import Workbook

# Set the path for pytesseract
pytesseract.pytesseract.tesseract_cmd = r'<path_to_tesseract_executable>'

def ocr_core(filename):
    """
    This function will handle the core OCR processing of images.
    """
    text = pytesseract.image_to_string(Image.open(filename))
    return text

def save_text_to_excel(text, output_file):
    """
    This function will save the output text to an Excel file.
    """
    # Initialize a new Workbook object
    wb = Workbook()

    # Select the active worksheet
    ws = wb.active

    # Split the text into lines and insert them into the worksheet
    for line in text.split('\n'):
        ws.append([line])

    # Save the Workbook to an Excel file
    wb.save(output_file)

if __name__ == '__main__':
    if len(sys.argv) != 3:
        print("Usage: python image_to_text.py <path_to_image> <output_excel_file>")
        sys.exit(1)

    # Get the input image file name and output Excel file name from the command line arguments
    image_file = sys.argv[1]
    output_file = sys.argv[2]

    # Perform OCR on the input image
    text = ocr_core(image_file)

    # Save the output text to an Excel file
    save_text_to_excel(text, output_file)

    print(f"Successfully converted {image_file} to {output_file}")

At first this code snippet still feels a bit intimidating to me. Then I copy it into VS Code and see that I don't need an Excel file, so I delete those parts. Running the program, I get an error, consult the computer agent, install the necessary libraries, and try again. I experience some frustration at receiving error messages, but I remember Allen B. Downey's book, and realise that errors are valuable parts of the scientific method.

import os
import sys
import pytesseract
from PIL import Image

def ocr_core(filename):
    """
    This function will handle the core OCR processing of images.
    """
    text = pytesseract.image_to_string(Image.open(filename))
    return text


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Usage: python image_to_text.py <path_to_image>")
        sys.exit(1)

    image_file = sys.argv[1]

    print(ocr_core(image_file))

    print(f"Successfully converted {image_file}")

This is very cool! I run the program and get the output:

attila@amp convimg_to_text % python3 conv_img_txt.py ~/Desktop/valoszinusegszfeladata.png
A valosziniségszamitas feladata olyan mérték bevezetése, amely
a bizonytalansagot numerikusan méri, s erre alapozva olyan mate-
matikai médszerek kidolgozdsa, amelyekkel bizonyos események
(véletlen tsmegjelenségek) valdszintisége kisz4mithato.

Successfully converted /Users/attila/Desktop/valoszinusegszfeladata.png

This is satisfying, and contributes to a sense of agency in the world.

Deterministic and non-deterministic (stochastic) models. We set certain boundaries and use mathematical methods to describe a phenomenon with a model; by observing the phenomenon, we can then judge how good the model's predictions are.
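
One way to picture judging a model by observation is to compare what the model predicts with observed relative frequencies. This is only a sketch: the "observations" here are simulated with Python's `random` module (with a real die they would come from a journal of throws), and `observe_die` and `largest_gap` are made-up names.

```python
import random
from collections import Counter

def observe_die(num_rolls, seed=0):
    """Simulate die rolls and return the relative frequency of each face."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(num_rolls))
    return {face: counts[face] / num_rolls for face in range(1, 7)}

# The model: every face of a fair die has probability 1/6
model = {face: 1 / 6 for face in range(1, 7)}

observed = observe_die(60_000)

# A crude goodness measure: the largest gap between model and observation
largest_gap = max(abs(observed[face] - model[face]) for face in model)
print(observed)
print(largest_gap)
```

A small gap suggests the model describes the observations well; a large gap would be a hint that the die, or the model, is not what we assumed.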

When throwing a fair die, the model starts from the set of possible outcomes: Q = {1, 2, 3, 4, 5, 6}

Is the model then the same as the event space? Q = {A1, A2, A3, A4, A5, A6}

A recurring thought: I keep trying to connect the concept of "model" as used for LLMs with "model" as used in statistics.

a model

How to express a 'model' in mathematical notation, when observing coin tosses?
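
One common way to write this down, offered as a sketch rather than the only possible notation, is to list the set of outcomes and attach a probability to each. The symbols Ω (for the set of outcomes, called Q elsewhere in this post) and p (the chance of heads) are assumptions of this sketch.

```latex
\Omega = \{H, T\}, \qquad
P(\{H\}) = p, \qquad
P(\{T\}) = 1 - p, \qquad
0 \le p \le 1
```

For a fair coin the conjecture is p = 1/2, so both outcomes get probability 1/2 and the two probabilities add up to 1.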

Manipulatives for learning probability: fair die, cards, coin, journal, pencils. Is "when throwing a fair die, each number has a probability of 1/6 of being the outcome" a model?

A model is a conjecture.

It's interesting that learning now feels a bit like being an evolving intelligence.

I don't yet have a clear understanding of what a model is. Mathematical model

I asked on Discord. Keeping a Discord window open for a task is very helpful in learning, even when I am self-learning and answering my own questions, because I can share the ideas I find. Stack Exchange encourages writing our own answers, to be self-learners!


It was fun to figure out how to set the width of the images on this page: I adjusted the CSS file, giving widths to the img elements.

img {
	max-width: 100%;
	height: auto;
}

I am happy that Front End skills come in handy.

What is a model?

I can now create a probability model. A few hours ago I did not know what a probability model was. I asked for help on Discord.

PROBABILITY MODEL

My first ever probability model: Probability model of coin toss outcomes

6 visualisations of probability models of coin tosses.

I was very happy to have this visualisation, this is something to further develop, exploring

matplotlib visualisation of coin toss probability model

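import random

# One-line estimate of P(heads) from a million simulated coin tosses
print(sum(random.choice('HT') == 'H' for _ in range(10**6))/10**6)

# The same estimate, written out as an explicit loop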

num_trials = 1000000
num_heads = 0

for i in range(num_trials):
    coin_toss = random.choice(['H', 'T'])
    if coin_toss == 'H':
        num_heads += 1

probability_heads = num_heads / num_trials

print(probability_heads)
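
A small variation on the simulation above shows how the estimate settles near 0.5 as the number of tosses grows. This is a sketch under the assumption of a fair coin; the helper name `heads_frequency` and the fixed seed are illustrative choices, not part of the original scripts.

```python
import random

def heads_frequency(num_trials, rng):
    """Relative frequency of heads in num_trials simulated fair coin tosses."""
    heads = sum(rng.choice("HT") == "H" for _ in range(num_trials))
    return heads / num_trials

# A seeded generator makes the run repeatable
rng = random.Random(42)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, heads_frequency(n, rng))
```

With only 10 tosses the frequency can be far from 0.5; with 100,000 it is usually within a fraction of a percent.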

how y^3 becomes 3y^2(dy/dx)

I tried to solve this problem:

Please show how
x^3 + x + y^3 + 3y = 6
becomes
3x^2 + 1 + 3y^2(dy/dx) + 3(dy/dx) = 0
and
Please use `u` to substitute, and solve for `dy/du`, `du/dx`,
and show dy/dx = dy/du * du/dx

What is the chain rule in simple terms? Calculus Made Easy by Silvanus Thompson, edited by Martin Gardner, is an excellent and fun book that I have started reading and working through. In it, d means "a little bit of" something, so dx is a little bit of x.

x^3 + x + y^3 + 3y = 6

The chain rule is Thompson's dodge, whereby we represent a complex part of the equation with a single symbol. This is similar to using functions in Python.

x^3 + x + y^3 + 3y = 6
u = x^3 + x
u + y^3 + 3y = 6
u = 6 - y^3 - 3y
y^3 + 3y = 6 - u
(y + dy)^3 + 3(y + dy) = 6 - (u + du)
y^3 + (3y^2)dy + 3y(dy)^2 + (dy)^3 + 3y + 3dy = 6 - u - du
(3y^2)dy + 3dy = -du
dy(3y^2 + 3) = -du
dy/du = -1/(3y^2 + 3)

u = x^3 + x
u + du = (x + dx)^3 + (x + dx)
u + du = x^3 + 3x^2dx + 3x(dx)^2 + (dx)^3 + x + dx
du = 3x^2dx + dx
du = dx(3x^2 + 1)
du/dx = 3x^2 + 1

dy/dx = dy/du * du/dx
dy/dx = -1/(3y^2 + 3) * (3x^2 + 1)
dy/dx = -(3x^2 + 1)/(3y^2 + 3)

At first I couldn't figure out how the original equation results. The pieces fit together once the constant 6 and the minus sign in front of du are carried through every line, and the higher powers of dy and dx are dropped as negligibly small: multiplying dy/dx = -(3x^2 + 1)/(3y^2 + 3) through by (3y^2 + 3) and moving everything to one side gives 3x^2 + 1 + 3y^2(dy/dx) + 3(dy/dx) = 0. I did this by following the "useful dodge" idea from the Thompson book. The chain rule lets us swap a complex piece of an expression for a simple one. Chunking, modularising.
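
One way to sanity-check an implicit derivative is numerically. The target equation from the problem, 3x^2 + 1 + 3y^2(dy/dx) + 3(dy/dx) = 0, rearranges to dy/dx = -(3x^2 + 1)/(3y^2 + 3). The sketch below compares that formula with a finite-difference slope measured on the curve itself. The helper names (`curve`, `solve_y`, `slope_formula`, `slope_numeric`) and the bisection bounds are assumptions made for this illustration.

```python
def curve(x, y):
    """Left side minus right side of x^3 + x + y^3 + 3y = 6."""
    return x**3 + x + y**3 + 3*y - 6

def solve_y(x, lo=-10.0, hi=10.0, steps=200):
    """Find the y that puts (x, y) on the curve, by bisection.

    y^3 + 3y always increases with y, so there is exactly one such y.
    The bounds -10..10 are assumed wide enough for moderate x.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if curve(x, mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def slope_formula(x, y):
    """dy/dx obtained by rearranging the target equation."""
    return -(3*x**2 + 1) / (3*y**2 + 3)

def slope_numeric(x, h=1e-6):
    """dy/dx estimated from two nearby points on the curve."""
    return (solve_y(x + h) - solve_y(x - h)) / (2*h)

x = 1.0
y = solve_y(x)  # the point (1, 1) lies on the curve: 1 + 1 + 1 + 3 = 6
print(y, slope_formula(x, y), slope_numeric(x))
```

If the two slopes agree to several decimal places, the formula is at least consistent with the curve at that point.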

I opened a Math Exchange question about this and got insight; collaboration is truly awesome. I used to be terrified of collaborating. After learning from Jo Boaler about the power of collaboration in learning, I started doing it in baby steps, slowly dissolving the shyness, fear, embarrassment and tension that I experienced when interacting. Now I am at ease, and think of online collaboration as one of the best ways to learn.

Questions can also be formulated as Unix-style MVPs. Whatever question I have, I can note it in Obsidian, practice the TWR writing techniques on it, and develop it further. Then the question can be posted on Stack Exchange sites or on Discord. Or I could just ask an imperfect question as it is. If there is no response, I can collaborate with myself by self-learning and doing research. Collaboration with a computer agent is associated with high achievement in mathematics. This is something I do extensively. Even if I get imperfect responses, I can be on my guard and practice critical thinking.

One beautiful activity I found is to ask a computer agent to put abstract, inaccessible terms into analogies or lighthearted, accessible expressions. For example, ChatGPT can write excellent stories in the realm of the Sound Ocean, where Martin Gardner explains partial differentiation to Melody, the curious mathematician explorer, in pebbles, sea creatures, and ocean waves.

I understood, from analogies, that sin can be the distance of a Ferris wheel passenger from the ground.
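
The wheel analogy can be sketched in a few lines. Strictly, sin gives the passenger's height relative to the hub of the wheel, so the hub's own height is added to get the distance from the ground. The radius, the hub height and the function name `height_above_ground` are made-up values for illustration, and the angle is assumed to be measured from the horizontal line through the hub.

```python
import math

def height_above_ground(angle_radians, radius=10.0, hub_height=12.0):
    """Height of a passenger on a wheel, for a given angle of rotation."""
    return hub_height + radius * math.sin(angle_radians)

# A quarter turn at a time: level with the hub, top, level again, bottom
for quarter_turns in range(4):
    angle = quarter_turns * math.pi / 2
    print(quarter_turns, round(height_above_ground(angle), 2))
```

As the wheel turns, the height rises and falls smoothly between `hub_height - radius` and `hub_height + radius`, which is the familiar sine wave.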

Fermat's Last Theorem could be called Chasing an Elusive Unicorn, while Euclidean distance can be called the sunflower, cozy quotient, or raindrop ripple.

What is the concept of writing a function under one name, and then using the function by that name? For example, in `def f(x): ...` the `...` is something complex, and we then use `f()` instead of having to write the whole thing out.

I am practicing my prompt engineering abilities: defining functions to encapsulate functionality.
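
A small sketch of this idea, reusing the equation from earlier in this post: the complex part x^3 + x gets one name, `u`, and the rest of the code only needs that name. The function name `left_side` is an illustrative choice.

```python
def u(x):
    """The 'complex part' of the equation, hidden behind a single name."""
    return x**3 + x

def left_side(x, y):
    """Left side of x^3 + x + y^3 + 3y = 6, written with u in place of x^3 + x."""
    return u(x) + y**3 + 3*y

# The point (1, 1) satisfies the equation: 1 + 1 + 1 + 3 = 6
print(left_side(1, 1))  # 6
```

If the inside of `u` ever changes, only one place needs editing, which is the same convenience the substitution gives on paper.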

patterns that repeat

I collaborated with my sister on making repeating patterns. The Harold Jacobs book has a chapter on those, and I think the DAoM books have tasks/explorations on it.

Repeating patterns Escher template