Excellent! Here is a step-by-step guide to building a Telegram bot that learns during conversation and can also look up answers on Wikipedia, written entirely in Python and fully adapted for Hebrew:
- Step 1: Installing the libraries
- Step 2: Importing the libraries, creating the bot, and setting the Wikipedia language
- Step 3: Cleaning user text
- Step 4: The dialogues file, the basis for the bot's learning
- Step 5: Reading the file and building the learning model
- Step 6: Getting an answer from the model
- Step 7: Looking up an answer on Wikipedia
- Step 8: Replying to the user in Telegram
- Step 9: Updating an incorrect answer
- Step 10: Running the bot
Step 1: Installing the libraries
Before we start programming, we need to install three libraries:
- pyTelegramBotAPI: for sending and receiving Telegram messages
- wikipedia: for looking up answers on Wikipedia
- scikit-learn: for building the model that answers questions (basic machine learning)
Open a terminal and type:
pip install pyTelegramBotAPI wikipedia scikit-learn
Step 2: Importing the libraries, creating the bot, and setting the Wikipedia language
import telebot, wikipedia, re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
- Here we import all the libraries.
- Set the Wikipedia language to Hebrew:
wikipedia.set_lang("he")
- Create the bot with the token you received from BotFather:
bot = telebot.TeleBot('INSERT_YOUR_TOKEN_HERE')
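Rather than pasting the token into the source file, one option is to read it from an environment variable. The sketch below assumes a variable named TELEGRAM_BOT_TOKEN; that name is hypothetical and not something telebot requires, and load_token is an illustrative helper, not part of the original code.

```python
import os

def load_token(env_var="TELEGRAM_BOT_TOKEN"):
    """Return the bot token stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your bot token")
    return token

# Possible usage in place of the hard-coded string:
# bot = telebot.TeleBot(load_token())
```

This keeps the token out of version control; the trade-off is that the variable has to be set before the script starts.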
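Step 3: Cleaning user text
So that the model can make sense of user input, we strip out unneeded characters. The function below lowercases the text and keeps only spaces, Hebrew letters, digits, and basic punctuation.
# Allowed characters: space, Hebrew letters (including the final forms ך ם ן ף ץ,
# so that word endings are not cut off), digits, and basic punctuation
alphabet = ' אבגדהוזחטיכךלמםנןסעפףצץקרשת0123456789?%.,()!:;'

def clean_str(r):
    r = r.lower()
    # Keep only the characters listed in alphabet
    r = [c for c in r if c in alphabet]
    return ''.join(r)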
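A quick way to see what the cleaner keeps and drops is to run it on a mixed sample. The sketch below compares two hypothetical character sets, one without the Hebrew final-letter forms and one with them; BASE_ALPHABET, WITH_FINALS, and clean_with are illustrative names introduced here.

```python
# Hypothetical constants for comparison; only the character sets differ.
BASE_ALPHABET = ' אבגדהוזחטיכלמנסעפצקרשת0123456789?%.,()!:;'
WITH_FINALS = BASE_ALPHABET + 'ךםןףץ'

def clean_with(text, allowed):
    """Lowercase `text` and keep only characters found in `allowed`."""
    return ''.join(c for c in text.lower() if c in allowed)

sample = "Hello שלום, עולם! 123 #@"
without_finals = clean_with(sample, BASE_ALPHABET)
with_finals = clean_with(sample, WITH_FINALS)
print(repr(without_finals))
print(repr(with_finals))
```

Latin letters and symbols such as # and @ are dropped in both cases. Without the final forms, the closing ם of שלום and עולם is also removed, which is why the allowed set matters for Hebrew.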
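Step 4: The dialogues file, the basis for the bot's learning
Create a file named dialogues.txt
in the same folder as your code. Each line in the file holds one question-and-answer pair, for example:
שלום\שלום וברכה!
מה שלומך\אני מרגיש מצוין!
מי אתה\אני הבוט שלך.
(In English: "Hello" / "Hello and blessings!", "How are you" / "I feel excellent!", "Who are you" / "I am your bot.")
Note: the question goes before the backslash (\) and the answer after it.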
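The line format can be checked with a small parser. This is a minimal sketch, assuming one pair per line and a single backslash as the separator; parse_line is a hypothetical helper, not part of the bot's code.

```python
def parse_line(line):
    """Split one 'question\\answer' line into a (question, answer) tuple.

    Returns None when the separator is missing or either side is empty.
    """
    parts = line.rstrip('\n').split('\\')
    if len(parts) < 2:
        return None
    question, answer = parts[0].strip(), parts[1].strip()
    if not question or not answer:
        return None
    return question, answer

lines = ["שלום\\שלום וברכה!", "no separator here", "\\missing question"]
pairs = [p for p in map(parse_line, lines) if p is not None]
print(pairs)
```

Only the first sample line yields a pair; malformed lines are skipped rather than raising an error.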
Step 5: Reading the file and building the learning model
def update():
    """Rebuild the vectorizer and the classifier from dialogues.txt."""
    global vectorizer, clf
    with open('dialogues.txt', encoding='utf-8') as f:
        content = f.read()
    # Each line holds one question\answer pair
    blocks = content.split('\n')
    dataset = []
    for block in blocks:
        replicas = block.split('\\')[:2]
        if len(replicas) == 2:
            pair = [clean_str(replicas[0]), clean_str(replicas[1])]
            # Skip pairs that are empty after cleaning
            if pair[0] and pair[1]:
                dataset.append(pair)
    X_text = []
    y = []
    for question, answer in dataset[:10000]:
        X_text.append(question)
        y.append(answer)
    # Questions become word-count vectors; answers are the class labels
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(X_text)
    clf = LogisticRegression()
    clf.fit(X, y)

update()
What happens here?
- We load all the questions and answers from the file.
- We clean the questions and the answers.
- We turn the questions into word-count vectors with CountVectorizer.
- We train a LogisticRegression model so that the bot can guess the best-matching answer to a new question. Training needs at least two distinct answers in the file.
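To make the vectorizing step concrete, here is a plain-Python sketch of the bag-of-words idea. It is not scikit-learn's implementation: CountVectorizer also lowercases, tokenizes with a regular expression, and by default ignores one-character tokens. The names build_vocabulary and to_count_vector are hypothetical.

```python
from collections import Counter

def build_vocabulary(texts):
    """Map each distinct word to a column index, in sorted order."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}

def to_count_vector(text, vocab):
    """Return one word count per vocabulary entry; unknown words are ignored."""
    counts = Counter(text.split())
    return [counts.get(w, 0) for w in vocab]

questions = ["מה שלומך", "מי אתה"]
vocab = build_vocabulary(questions)
vectors = [to_count_vector(q, vocab) for q in questions]
print(vocab)
print(vectors)
```

Each question becomes a list of counts with one slot per known word. A classifier such as LogisticRegression then learns which count patterns go with which answer.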
Step 6: Getting an answer from the model
def get_generative_replica(text):
    # Clean the input the same way the training questions were cleaned
    text_vector = vectorizer.transform([clean_str(text)])
    answer = clf.predict(text_vector)[0]
    return answer
Here we take a question from the user and return the best-matching answer from what the model has learned.
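Step 7: Looking up an answer on Wikipedia
def getwiki(s):
    try:
        ny = wikipedia.page(s)
        # Take the first 1000 characters of the article
        wikitext = ny.content[:1000]
        # Split into sentences and drop the last, possibly cut-off one
        wikimas = wikitext.split('.')
        wikimas = wikimas[:-1]
        wikitext2 = ''
        for x in wikimas:
            if not('==' in x):
                # Keep sentences longer than three characters
                if len(x.strip()) > 3:
                    wikitext2 += x + '.'
            else:
                # '==' marks a section heading, so the introduction has ended
                break
        # Remove text in parentheses and in curly braces
        wikitext2 = re.sub(r'\([^()]*\)', '', wikitext2)
        wikitext2 = re.sub(r'\{[^\{\}]*\}', '', wikitext2)
        return wikitext2
    except Exception:
        # "I found no information on this topic on Wikipedia."
        return 'לא מצאתי מידע בנושא הזה בוויקיפדיה.'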
This function:
- Looks up a Wikipedia article matching the user's query.
- Takes the first 1000 characters of the article, drops the trailing cut-off sentence, and stops at the first section heading.
- Removes text in parentheses and in curly braces.
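The trimming logic can be tried without a network call by running it on a sample string. The sketch below mirrors the steps listed above; trim_article_text and the sample text are made up for illustration.

```python
import re

def trim_article_text(content, limit=1000):
    """Keep the leading sentences of `content`, up to the first '==' heading."""
    text = content[:limit]
    sentences = text.split('.')[:-1]  # drop the trailing, possibly cut-off fragment
    kept = ''
    for s in sentences:
        if '==' in s:  # a section heading: stop here
            break
        if len(s.strip()) > 3:
            kept += s + '.'
    kept = re.sub(r'\([^()]*\)', '', kept)  # remove parenthesized text
    kept = re.sub(r'\{[^{}]*\}', '', kept)  # remove brace-enclosed text
    return kept

sample = ("Python (programming language) is popular. It is used widely.\n\n"
          "== History ==\nCreated in 1991. Unfinished sent")
summary = trim_article_text(sample)
print(repr(summary))
```

Removing the parenthesized text leaves a double space behind; collapsing whitespace afterwards would be a reasonable extra step.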
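Step 8: Replying to the user in Telegram
@bot.message_handler(commands=['start'])
def start_message(message):
    # "Hello sir, how can I help?"
    bot.send_message(message.chat.id, "שלום אדוני, איך אפשר לעזור?")

question = ""  # the last question asked, used when the user corrects an answer

@bot.message_handler(content_types=['text'])
def get_text_messages(message):
    global question
    command = message.text.lower()
    if command == "לא נכון":  # "not correct"
        # "So what is the correct answer?"
        bot.send_message(message.from_user.id, "אז מה התשובה הנכונה?")
        bot.register_next_step_handler(message, wrong)
    else:
        question = command
        reply = get_generative_replica(command)
        if reply == "ויקי":  # "wiki": marker answer that sends the query to Wikipedia
            bot.send_message(message.from_user.id, getwiki(command))
        else:
            bot.send_message(message.from_user.id, reply)
If the user writes "לא נכון" ("not correct"), the bot asks "אז מה התשובה הנכונה?" ("So what is the correct answer?") and adds the reply to the file. If the model's answer is the marker "ויקי" ("wiki"), the bot searches Wikipedia with the user's text instead.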
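Because the code keeps the last question in a single global variable, two people chatting at the same time could overwrite each other's question. One possible alternative, sketched here with hypothetical helper names, is to store the pending question per chat id.

```python
# Hypothetical helpers; the original code uses one global `question` instead.
last_question = {}  # chat id -> most recent question from that chat

def remember_question(chat_id, text):
    """Record the latest question asked in this chat."""
    last_question[chat_id] = text

def build_correction(chat_id, corrected_answer):
    """Return the 'question\\answer' line for this chat, or None if nothing is pending."""
    question = last_question.pop(chat_id, None)
    if question is None:
        return None
    return f"{question}\\{corrected_answer}\n"

remember_question(1, "מי אתה")
remember_question(2, "מה שלומך")
line_for_chat_1 = build_correction(1, "אני בוט")
print(repr(line_for_chat_1))
```

In the handlers, message.chat.id would serve as the key. The entry is removed once it has been used, so a second correction without a new question returns None.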
Step 9: Updating an incorrect answer
def wrong(message):
    # Append the corrected pair in the same question\answer format
    a = f"{question}\\{message.text.lower()}\n"
    with open('dialogues.txt', "a", encoding='utf-8') as f:
        f.write(a)
    bot.send_message(message.from_user.id, "עודכן!")  # "Updated!"
    # Retrain so the correction takes effect immediately
    update()
Here we update the file and immediately refresh the model, so the bot really does "learn" in real time.
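One detail to watch when appending: if dialogues.txt does not end with a newline, the new pair is glued onto the last existing line. The sketch below shows one way to guard against that; append_pair is a hypothetical helper, and the demo writes to a temporary folder and does not touch the real file.

```python
import os
import tempfile

def append_pair(path, question, answer):
    """Append one 'question\\answer' line, adding a newline first if the file lacks one."""
    needs_newline = False
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, 'rb') as f:
            f.seek(-1, os.SEEK_END)  # look at the last byte only
            needs_newline = f.read(1) != b'\n'
    with open(path, 'a', encoding='utf-8') as f:
        if needs_newline:
            f.write('\n')
        f.write(f"{question}\\{answer}\n")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'dialogues.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write("מי אתה\\אני הבוט שלך.")  # deliberately no trailing newline
    append_pair(path, "מה שלומך", "מצוין")
    with open(path, encoding='utf-8') as f:
        lines = f.read().splitlines()
print(lines)
```

The file ends up with two separate lines, so the reading code in Step 5 sees two pairs instead of one merged line.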
Step 10: Running the bot
bot.polling(none_stop=True)
That's it! The bot is ready to go.