Prompt Engineer

Just as we needed to learn the techniques of searching engines back then, we also need to learn some techniques for communicating with AI, providing reasonable and sufficient constraints, and efficiently obtaining the answers we need.

If you look at it from a different angle, current AI is like a very good student with excellent memory – it has the ability to memorize everything. What we need to do is learn how to communicate with AI correctly, effectively, and precisely describe our needs to help AI generate the expected results.

##科普 The incredibly popular AI – specifically Generative Pre-Training – literally translates to generative pre-training. It’s a deep learning model that trains text generation based on internet-available data, used for tasks like question answering, text summarization generation, machine translation, classification, code generation, and conversational AI. Currently, there have been various versions of models such as GPT-1, GPT-2, GPT-3, and GPT-4, each larger and more powerful than the previous one.

Does it Really Have Intelligence?

  • Similarity is high, accuracy increases accordingly.
  • Basic, repetitive tasks, after specific training, no longer require human intervention.

Generative AI is a technology that utilizes existing data such as text, audio, and images to create new content. It can be used for various tasks including text generation, speech synthesis, image generation, and dialogue systems. The logical reasoning of Generative AI depends on its training data and model structure. Generally, Generative AI can follow grammar, logic, and common sense to a certain extent, but it may also produce errors, biases, or unrealistic content. Therefore, the output of Generative AI requires human judgment and verification; it should not be blindly trusted or used.

Prompt Engineer

Time flows only forward, and people need to learn to adapt to the current. You can think of AI as not intelligent, lacking logic, and often producing unusable code.

If you shift your perspective, the current AI is like a child with excellent memory – it has the ability to memorize after reading something repeatedly, much like copying homework. What we need to do is learn how to communicate correctly, effectively, and with AI, precisely describing our needs to help AI generate the desired results.

Dialogue Patterns

Two years ago, when GitHub Copilot was released, no one could have imagined that two years later, OpenAI would appear out of nowhere, allowing humans to recognize the capabilities of large language models.

Based on comment-based programming and conversational programming, the interaction logic is completely different. The dialogue pattern is friendly for novice users, and it’s necessary to mention NewBing providing follow-up prompts after each question. Microsoft is trying to guide users to obtain more content from the AI knowledge base.

  • Example Scenario Scene: A long high-speed train journey, coding is a good choice. With limited traffic, processing a few images, obtaining image materials, downloading software to process images is a bit extravagant, and network conditions don’t allow it. Requirement: There are a batch of picture materials with transparent background images used as navigation bars for small programs. The selected state is colored, and the unselected state is black and white. The materials are all in color, and they need to be preprocessed. Prompt: Write a python code snippet to read all pictures from a folder, convert the pictures into black and white, maintain the transparent background, save the pictures, optimize the image quality, reduce file size, and display it in the console, comparing the sizes before and after optimization.
# Import necessary libraries
import os
from PIL import Image

# Define a function to convert images to black and white, maintain transparency, save images, and return file sizes
def convert_and_save(image_file):
    # Open the image
    image = Image.open(image_file)
    # Get the image mode, if it's RGBA mode, indicating a transparent background
    mode = image.mode
    if mode == "RGBA":
        # Create a white background image with the same size as the original image
        background = Image.new("RGB", image.size, (255, 255, 255))
        # Paste the original image onto the white background, ignoring transparent pixels
        background.paste(image, mask=image.split()[3])
        # Convert the composite image to grayscale mode
        gray_image = background.convert("L")
        # Convert the grayscale image back to RGBA mode to preserve transparency
        final_image = gray_image.convert("RGBA")
    else:
        # If it's not RGBA mode, directly convert the image to grayscale mode
        final_image = image.convert("L")
    # Get the original image file name and extension
    file_name, file_ext = os.path.splitext(image_file)
    # Define the new image file name with _bw suffix for black and white
    new_file_name = file_name + "_bw" + file_ext
    # Save the new image and optimize quality, reduce file size
    final_image.save(new_file_name, optimize=True)
    # Get the original and new image file sizes and return
    old_size = os.path.getsize(image_file)
    new_size = os.path.getsize(new_file_name)
    return old_size, new_size

# Define a folder path to store all images
folder_path = "images"
# Iterate through all files in the folder
for file in os.listdir(folder_path):
    # Concatenate the full file path
    file_path = os.path.join(folder_path, file)
    # Check if the file is an image file based on extension (can add or modify extensions as needed)
    if file_path.endswith((".png", ".jpg", ".jpeg", ".gif")):
        # Call the function to convert and save the image, and get the file size
        old_size, new_size = convert_and_save(file_path)
        # Display the comparison of file sizes before and after optimization in the console
        print(f"{file}: {old_size} bytes -> {new_size} bytes")

AI quickly provided an answer, looked very beautiful, and worked well, experienced programmers would find that the code below was not very generic, and couldn’t be put into production:

  • Fixed parameters
  • No log files, all information is printed to the console
  • The task of image processing is suitable for a multi-threaded model NewBing appeared, based on the prompt, chose a question: What can be improved in this code? The answer given this time was far beyond expectations, and most programmers would think of optimization measures, AI gave corresponding suggestions.
This code may still have some improvements, such as:

You can use the argparse library to allow users to specify the folder path and extension through command-line parameters instead of hardcoding them in the code.
You can use the logging library to record the running status of the program instead of printing it directly to the console.
You can use try-except statements to catch potential exceptions, such as file opening failure, image conversion failure, etc., and handle them

## Let's illustrate with an example

```python
# Import necessary libraries
import argparse
import logging
import multiprocessing
import os

from PIL import Image


# Define a function to convert and save the image, maintaining transparent background and optimizing file size
def convert_and_save(image_file):
    # Open the image
    try:
        image = Image.open(image_file)
    except Exception as e:
        logging.error(f"Failed to open {image_file}: {e}")
        return None, None
    # Get the image mode; if it's RGBA, it has a transparent background
    mode = image.mode
    if mode == "RGBA":
        # Create a white background image with the same size
        background = Image.new("RGB", image.size, (255, 255, 255))
        # Paste the original image onto the white background, ignoring transparent pixels
        background.paste(image, mask=image.split()[3])
        # Convert the combined image to grayscale mode
        gray_image = background.convert("L")
        # Convert the grayscale image back to RGBA mode to preserve the transparent background
        final_image = gray_image.convert("RGBA")
    else:
        # If it's not RGBA, directly convert the image to grayscale mode
        final_image = image.convert("L")
    # Get the original image file name and extension
    file_name, file_ext = os.path.splitext(image_file)
    # Define the new image file name with _bw suffix for black and white
    new_file_name = file_name + "_bw" + file_ext
    # Save the new image, optimizing quality to reduce file size
    try:
        final_image.save(new_file_name, optimize=True)
    except Exception as e:
        logging.error(f"Failed to save {new_file_name}: {e}")
        return None, None
    # Get the original image and new image file sizes, and return
    old_size = os.path.getsize(image_file)
    new_size = os.path.getsize(new_file_name)
    return file_name, old_size, new_size

# Define a function to parse command-line arguments and return the folder path and extension list
def parse_args():
    # Create an argument parser object
    parser = argparse.ArgumentParser(description="Convert images to black and white and optimize quality.")
    # Add a required positional argument, representing the folder path
    parser.add_argument("folder_path", help="The path of the folder that contains the images.")
    # Add an optional argument, representing the extension list, defaulting to png, jpg, jpeg, and gif
    parser.add_argument("-e", "--extensions", nargs="+", default=[".png", ".jpg", ".jpeg", ".gif"], help="The extensions of the image files.")
    # Parse command-line arguments and return the result object
    args = parser.parse_args()
    return args.folder_path, args.extensions

# Define a function to print the comparison of file sizes before and after optimization
def print_result(result):
    # If the result is not empty, indicating successful conversion and saving
    if result:
        # Unpack the result into a tuple of filename and file size tuple
        if len(result) == 3:
            file, old_size, new_size = result
            # Display the comparison of file sizes before and after optimization in the console
            logging.info(f"{file}: {old_size} bytes -> {new_size} bytes")
        else:
            # Log the result if it's not a tuple of 3 elements
            logging.info(f"{result}")

# Configure the logger, outputting logs to the console and files, setting the log level to INFO
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s",
                    handlers=[logging.StreamHandler(),
                              logging.FileHandler("log.txt")])

# Call the function to get the folder path and extension list
folder_path, extensions = parse_args()

if __name__ == "__main__":
    # Windows needs this function because Windows lacks the fork() function (not entirely accurate).
    # Therefore, on Windows, forking is simulated by creating a new process, and the code - Additionally, this new process is instructed to run the code passed through a pipe by passing the `--multiprocessing-fork` command-line argument to it.
- If you examine the implementation of the `freeze_support()` function, its task is to check which process it's running in and whether it should execute the code passed through a pipe.
- `multiprocessing.freeze_support()`

- A process pool is created, automatically allocating processes based on the number of cores on the computer.
- An empty list, `results`, is created to store the result objects of asynchronous tasks.
- The script iterates through all files in the folder:
- The full file path is constructed using `os.path.join()`.
- It checks if the file ends with any of the extensions specified in the `extensions` list (you can modify this list as needed).
- If it's an image file, the `convert_and_save` function is called to convert and save the image asynchronously, without blocking the main process. The file size is also obtained.  The result object is then added to the `results` list using `pool.apply_async()` with the callback function `print_result`.
- The process pool is closed to stop accepting new tasks.
- `pool.join()` is called to wait for all tasks in the pool to complete.

## Epilogue
Due to local development being on a `windows` system, the first answer given by `AI` did not include the `main` function and also lacked `multiprocessing.freeze_support`. The code was fixed after following up and encountering an error. 
Just as learning the techniques of search engines required skill, we also need to learn how to communicate with `AI`, providing reasonable and sufficient constraints to efficiently obtain the desired answers.
Note: **If you are a programming beginner, if you still don't understand certain parts of the code based on the given comments, please continue to ask related questions.**
Licensed under CC BY-NC-SA 4.0
Last updated on Jun 02, 2025 20:54
A financial IT programmer's tinkering and daily life musings
Built with Hugo
Theme Stack designed by Jimmy