AI-assisted programming, the evolution of productivity

GitHub Copilot was released less than two years ago, and now ChatGPT has appeared. I don’t fully understand the underlying principles, but I’ve been using both for a while. The two tools assist at completely different levels, yet both have brought a significant increase in productivity.

AI still can’t handle things that are too complex, because it lacks logical reasoning. For work that relies on patterns or fixed frameworks, where the training data is sufficient, its results are nine out of ten.

GitHub Copilot

When it was first released, I looked at the demo in the official introduction and found it wasn’t very smart. I tried it, the results weren’t great, and I gave up on it.

Before 2022, when I had some free time, I tried the new version, and the results were already quite good. The only issue was that it was a bit slow in China, probably for network reasons. For languages like Python, the code generated from comments was pretty good, and much faster than researching and writing the code myself.
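To give a sense of what comment-driven generation looks like, here is a hand-written illustration, not actual Copilot output: you type the comment and the function signature, and the assistant proposes a body along these lines. The function name `top_words` and the task itself are made up for this sketch.

```python
import re
from collections import Counter


# Return the n most common words in a text, ignoring case.
def top_words(text, n):
    # Split on runs of word characters after lowercasing the input.
    words = re.findall(r"\w+", text.lower())
    # Counter.most_common returns (word, count) pairs, highest count first.
    return Counter(words).most_common(n)


if __name__ == "__main__":
    print(top_words("The cat and the dog and the bird", 2))
```

You still have to read the suggestion and check that it does what the comment asked for.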

ChatGPT

GitHub Copilot still targets developers: the product is designed from a development perspective, so you need to know how to program and you guide the AI through comments. ChatGPT has moved to the next level, letting non-experts get started quickly. It offers a conversational chat interface where you can pose questions, describe requirements, and ask for clarification when needed.

Let’s take an example: you know about tokenization but don’t know how to code, and you want to calculate the similarity of two strings:

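In Python, use word segmentation to calculate the similarity of the two strings 維天運通 LOGORY and 合肥維天運通信息科技股份有限公司;
before comparing, normalize the strings by converting English to lowercase;
if the first character of the first string can be found in the second string, truncate the second string to the length of the first string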

Hello, this is Bing. 😊

According to online information [1][2][3], Python tokenization can be implemented using the jieba library. The jieba library supports three segmentation modes: precise mode, full mode, and search engine mode. You can choose an appropriate mode based on your needs.

To compare two strings, you first need to tokenize them and then calculate their word vectors or edit distance metrics. You can use libraries like NumPy or SciPy for mathematical operations.

To standardize string processing, converting English to lowercase, you can use Python’s built-in lower() function.

To slice the second string, you can use Python’s slicing operator [].

Here is a possible code example:

import jieba
import numpy as np
from scipy.spatial.distance import cosine

# Define two strings
str1 = "維天運通 LOGORY"
str2 = "合肥維天運通信息科技股份有限公司"

# Standardize the string, convert English to lowercase
str1 = str1.lower()
str2 = str2.lower()

# Find the position of the first character of the first string in the second string
index = str2.find(str1[0])

# If found, slice the second string
if index != -1:
    str2 = str2[index:index+len(str1)]

# Tokenize the two strings using the jieba library
words1 = list(jieba.cut(str1))
words2 = list(jieba.cut(str2))

# Calculate the word vectors of the two strings (using frequency as weights)
vector1 = np.zeros(len(words1))
vector2 = np.zeros(len(words1))

for i, word in enumerate(words1):
    vector1[i] += 1 # Count the number of times each word appears in the first string
    if word in words2:
        vector2[i] += 1 # Count the number of times each word appears in the second string

# Calculate the similarity between the two strings (using cosine distance as an index)
similarity = 1 - cosine(vector1, vector2)

print("The similarity between the two strings is:", similarity)

If you go on to ask it for the result of running this code, congratulations, you have discovered a problem. The result is fabricated: it does not actually run the code, but it will still give you a plausible-looking result.
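The only reliable check is to run something yourself. The sketch below is not the chatbot's code. It keeps the same lowercase and truncation steps, but assumes character-level tokens instead of jieba so that it runs with only the standard library, and it counts real token frequencies with `Counter`. The function names `cosine_similarity` and `truncated_similarity` are my own.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using token counts as weights."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    # A Counter returns 0 for missing keys, so unshared tokens contribute nothing.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def truncated_similarity(str1, str2):
    """Lowercase both strings, truncate str2 as the prompt describes, then compare."""
    str1, str2 = str1.lower(), str2.lower()
    index = str2.find(str1[0]) if str1 else -1
    if index != -1:
        str2 = str2[index:index + len(str1)]
    # Assumption: one token per character (no jieba segmentation).
    return cosine_similarity(list(str1), list(str2))


if __name__ == "__main__":
    print(truncated_similarity("維天運通 LOGORY", "合肥維天運通信息科技股份有限公司"))
```

Because the tokens differ from what jieba would produce, the number printed here is not comparable to whatever the chatbot claims. The point is only that a real run gives you a value you can trust.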

Example

Question: SQL statement to retrieve records from the product code table where the product code is within the range 80000-89999.

Hello, this is Bing. 😊

According to search results [1][2], you can use the SQL BETWEEN operator to select values within a given range. The BETWEEN operator includes both the start and end values.

You need to first specify the table name and column name that you want to query, and then use the BETWEEN operator and the range value in the WHERE clause.

For example, if you want to retrieve records from the product code table where the product code is between 80000 and 89999, you can use the following statement:

SELECT * FROM product_code
WHERE code BETWEEN 80000 AND 89999;

Here, '*' represents selecting all columns [3]. If you only want to select specific columns, such as name and price, you can separate them with commas, as shown below:

SELECT name, price FROM product_code
WHERE code BETWEEN 80000 AND 89999;

I hope this helps.

Follow up and contradict it: The product code is a string, so your code is wrong.

Sorry, I didn't notice that the product code is a string. If that’s the case, you need to add single quotes around both ends of the range values, as follows:

SELECT * FROM product_code
WHERE code BETWEEN '80000' AND '89999';

Please forgive my mistake. 🙏

Congratulations, it admits its mistakes.
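Admitting a mistake is not the same as the corrected answer being fully right, so it is still worth checking. With a string column, BETWEEN compares text character by character rather than numerically. Here is a minimal sketch using Python's built-in sqlite3 module, with a made-up table and made-up rows, assuming the column is declared as TEXT. Other databases and collations may behave differently.

```python
import sqlite3

# Hypothetical in-memory table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_code (code TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO product_code VALUES (?, ?)",
    [
        ("79999", "below range"),
        ("80000", "lower bound"),
        ("85", "short code"),
        ("800000", "six digits"),
        ("89999", "upper bound"),
        ("90000", "above range"),
    ],
)


def codes(query):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(query)]


# Text comparison: '85' and '800000' sort between '80000' and '89999'.
text_range = codes(
    "SELECT code FROM product_code "
    "WHERE code BETWEEN '80000' AND '89999' ORDER BY code"
)

# One possible tightening: also require exactly five characters.
strict_range = codes(
    "SELECT code FROM product_code "
    "WHERE code BETWEEN '80000' AND '89999' AND length(code) = 5 ORDER BY code"
)

if __name__ == "__main__":
    print(text_range)
    print(strict_range)
```

If every product code is known to be exactly five digits, the quoted query is enough. The length check only matters when codes of other lengths can appear in the table.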