Python
pynew_user, 2021-11-05 14:57:01

The code does not correctly read the words from the file. Why and what is wrong?

There is a task:
Write a program that reads text from a file (there can be more than one line in the file) and displays the most frequent word in this text and, separated by a space, how many times it occurred. If there are several such words, output the first one lexicographically (you can use the < operator for strings).

Give the output of the program as your answer, not the program itself.

Words that differ only in letter case are considered the same.

Sample Input:

abc a bCd bC AbC BC BCD bcd ABC
Sample Output:

abc 3
The input is a file (I can't link to the question). Here is the text of the file:

File text
ZaYYYT Xbac Z ZdbTXbY badTdXp Uaddbap UU Z pUTadZd Uaddbap ZdbTXbY ZdbTXbY bTdUbd c XY XUTTTZU d pa bUUYa Z bddpXTX bXTUYaX ZdbTXbY ZdbTXbY ZTX ZaZUZT YXYZa bY bUUYa c TcZUU
ddTbZXXb ddTbZXXb ddTbZXXb ddTbZXXb UccUZp XUppdTXd b ddTbZXXb ddTbZXXb ZTpccTXb ZTpccTXb ddTbZXXb pTaUbU ab YaX ddTbZXXb ZcTcaTUZ
cXcZXc TdTUXaTa ccbcYdUcY bpdcYXcZ ZpaZcXd b pUcYYa bpdcYXcZ pUcYYa cXcZXc pXcT bpdcYXcZ adZppYaY bUdap pacddb dTbbZ ZUXd pUcYYa pXcT YaYcUYZdd dY YbddUpcTZ dUdTXda aUU ppYbTcddp aUbYbYUZ UbbYa aUbYbYUZ TpcXaZaZ aUbYbYUZ UXUadaXc UZdda XTpT bdXddbZY aUbYbYUZ UXUadaXc YTZYaTXXZ pUbU U cTbp dUZ apaXcaZdd TppppT TpbbpU TbZXU Taad ZdcUp p bpTTppUY aUbYbYUZ XY ZZpcdp cpTXTUaZY
cbcpaZYba YpXcd d apUZ YacXTpa d apUZ Yc apUZ d cTppTZ apUZ TZcYXXZcU pZd ZXZYYZ XcT apUZ cccaYZb ddZpT aTTZYc dYc pbcZYdZaa dTd UYZU a pbcZYdZaa d UaZ apUZ Zcb UYZU cbTdUbaYb YU apUZ aYTccpb X
adadpUpY UX UX aZTT aZTT cUdpbc cZbZbX UX dYd YY pX Tac cb UX baYU TdYdpd X UX ccZUZXY YcpXbdU dZdZpTT cTY XZUXpUZ UZYUT ppdZYdU YcpXbdU cd b bapaUX YXTpdTa paUY pZaYX aZTT TdcZXT ZYpX TdcZXT TYdY c aZTT acXYUbbTb dZdZpTT pabb pcUTbcdpZ bXZU apXZadY ZcYbdZpa paUY
XYY XYY XYY Y XYY UbpabT c aX XTd U TpTbaZa Y pXXYbcpZY pXXYbcpZY cTY Z ppbYb pddZU pXXYbcpZY XYY ZXX UZZ YpbZY TXbTXdTb bdYc pZ UXUbX UUcYpXY UdYZ bZpUZXbaX T YapXX TapZY UTcUaccT Y UTcUaccT pXXYbcpZY UbbcXUcc bdYc pZ apdpbY ZUbXYUTX UZZ bY TppadYYp bZaUpa bbYUUTT aXdaTabT XYY daUTUad Zc
da ddZ cca aXpaY aXpaY bZZ bcd da ddZ ddZ bcd dcpYUT pY bcd acTYYUUZ da YTY bcd aXpaY ZapbYZYc ZXXa ddZ cXcppba abZda X acTYYUUZ bcd dacTUaY pUdbUUbb UXdXbcX pY aXa Ua T UZYUp T YUZaZ TaTYYcZ bZYpp ppadcYaba ppY ddZ
aZ XaUc ZZapX ZZapX aXXXYT bdTdUU ZZapX aZ Tabpb aUYUZpUY cZab UZ ZZapX UZ ccaUZa aaTTdTZ YUUa aZpYY ZYYZY aXXXYT U YZTY dU TdaYXUcc dYbZ dU TTdYU ZdcXUp
YcU aTcU UaZ XYYXaaUb XYYXaaUb XYYXaaUb XYYXaaUb YcU YU XYYXaaUb YcU bccd dddaadaY XZTdY XZTdY a dp TYXTp bdY UaZ pXaYY YcU TZ bdY YcU XaUUpaU pcYYXXZd YcU XZTdY ddZ cbUUY dUaZXc XXbT bUUZcT XYYXaaUb XTZZpYpZX TZaYc dYZ bdY cUbUZYap UcpccXd XadZpUY cUTZpZ UaZ XYYXaaUb aabXbdcb
YXZXbbcZb YdXUU dUcd UpZbTX adZdUTp adZdUTp YdXUU UaYYYcZ YXZXbbcZb p dUcd YdXUU adZdUTp ZppbUY ba UTZdZdY ab ZppbUY dd ba ZUTTaXbpb UpZbTX UTZdZdY pX aaUZXZ ab dUcd pppTTXUZp adZdUTp YTpa Zbpc ZppbUY UpZbTX db ab X dUcd ZXpXUZ dUcd dcTaXZaX ba d Ybc ZppbUY X pdcYbYdZ
bZZb dZpUUUX pcT cTb dZpUUUX ab bZZb bZZb ZTUbZU dZpUUUX ab ZdZTpb ab cX caYdYdaZa ab ab pT bZZb ZUUXXadaa TaUbdZa bpdc dbZbbXpcd abYpdTada XYcTcZd bZZb Upa b aXcZbXUXb XbU XbU UUpYbpda
Tcb dd Tcb a pdcXc Tcb Ub Tcb X XXpZcdUYX appTdZ UXZ XXpZcdUYX UXZ Ub ZZadTbbY ad Tcb XaacZUad TbpddZ a XccbZpbaX acUT aXY cpUd ZYcYdZdUp ZU dZ ZXbdZbdZU U XTdbUTU UcadYc Tcb ZZadTbbY appTdZ UXZ XZaad Tcb XdYd Tcb YUTYUpZ acaZaTba b YUTYUpZ dd
TppcYcpcb c c dYapd U Y ZpapbU dT c addbYTTdT cYYcXpYd bb a
paZbabXZ paZbabXZ abpUddd paZbabXZ U pZaTTb ZZdbdccb bXcXZpYb UXTpXTpTY c d XcacbbU ZZ bbUc XXcZYZZXb TpcZaZp c XpTbcdbc cTTZX YX TpcZaZp bp XcacbbU ZZ dapYa TabTacZ cUTYT ZZ pXc bpYpdcc ccYpTpUZZ abpUddd aUXXTd XcacbbU XcacbbU TpcZaZp ZTY ZdpXX TpcZaZp ppZZ aUXXTd Z dU YX
dcYbYZYdX TZaadbTTZ Y ZZZ bTUaUbd bdbX caZZbdaT bUTTpbccc XpaXYTcp ZZZ XccTUUY bbpcbpZc dXddbTa aTZppT TZaadbTTZ ZZZ dXddbTa UacTpTbb TabaZ TZaadbTTZ cZXdadU YddTcUpdd XdcXb ZZZ bdbX dY XZT bbpcbpZc YddTcUpdd YddTcUpdd Y paU pdab UpYddU bcdYaU bdbX aXUpaUd b TZaadbTTZ bcdYaU XccTUUY Yaabc bcdYaU YacpX ZZXTZ ZUdYUTc bbpcbpZc d cTd aYUUd ZYppUdUZX dY pXpZd
cbYbdd cbYbdd dY bTXYpcbb YdbYTUpZ U pU YdbYTUpZ aXppaXb YdbYTUpZ U cbYbdd bYZTa bYZTa U cbYbdd cbYbdd cbYbdd ZZpb dpapX U XT U XT XpdaYXX aTdpYZZb bYdZ XT XdTUd bcdUYTpT ddYdXd bpZ YYUaYUdT cbYbdd p
bYYTUpaTd pcp aUbZ YZUbUdZZ pcp aUbZ ZpXpTbdUd ZpXpTbdUd XTT ZpXpTbdUd dbdbUY YpYcTdTU ZpXpTbdUd dXYpXaT YcaTX cTY pcp b ppUYcZYUY b X aXZpaYX YZUbUdZZ UYUcYaZdT XYXcXb Za adXpd UYUcYaZdT Za UXadXaad UbbaaZZUX pYYZdpYa Z ZpUXXbp UYUdYpdT pad ccbbcUaZ TUTbUXcdb cZbbp ZbaX
Z UUXTa cXTXZbUba TXXXbUbY Z p TXXXbUbY badX UUXTa Z TXXXbUbY X pXabXbYaU p
UZ YTbapTb YTbapTb ZYbdTd UZ UZ YYY ppdbpaZ acYUd pXTZYYc ZUYdTY c XTTbT ccTYZ pXTZYYc bUaU YTbapTb ZZbTXcUad dacYd T ccXTX pXTZYYc pdY bZbUdZbXp cb pbpaTTb XTTbT p ddaXYbUU UaZYpdY


I wrote the code according to the task statement, but for some reason it does not find the most frequent word correctly. My output: xttbt 2
My code:

slov={}
with open('C:\\Users\\telel\\Downloads\\dataset_3363_3.txt') as inp:
    for line in inp:
        line=line.lower()
        for i in line.split():
            if i not in slov:
                slov[i]=1
            elif i in slov:
                slov[i]+=1
max_value=1
for key,value in slov.items():
            primary=slov[key]
            if primary>max_value:
                max_key=primary
                max_slov=key
with open('C:\\Users\\telel\\Downloads\\dataset_3363_3.txt', 'w') as out:
    popular=(max_slov + ' ' + str(max_key))
    out.write(popular)


In theory, it should record the most popular value and add it to the key. I also cannot implement the condition "If there are several such words, display the first one lexicographically (you can use the < operator for strings)." Please provide clarification.


2 answer(s)
Viktor, 2021-11-05
@MadLor

Here

if primary>max_value:
                max_key=primary
                max_slov=key

you need to add max_value = primary, i.e.:
if primary > max_value:
        max_key = primary
        max_slov = key
        max_value = primary

You find a new "current" maximum but never record it, so every later word is still compared against the initial max_value of 1. That covers the first question.
As for:
I also cannot implement the condition "If there are several such words, display the first one lexicographically (you can use the < operator for strings)." Please provide clarification.

If I were you, I would simply sort the dictionary by value in ascending (or descending) order, then compare the values of the last (or first) two or more keys to see whether one word or several words occur the maximum number of times.
PS Using the < and > operators, you can compare not only numeric values, but also strings.
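
A minimal sketch that combines both points from this answer, assuming the word counts are already in a dictionary like slov from the question. The function name most_frequent and the in-memory sample are made up for illustration, and this is only one possible way to do it. It assumes every count is at least 1; for an empty dictionary it returns (None, 0).

```python
def most_frequent(slov):
    """Return (word, count) for the highest count; ties go to the smaller word."""
    max_slov = None
    max_value = 0
    for key, value in slov.items():
        # A strictly larger count always wins. On an equal count, the
        # lexicographically smaller word wins (string comparison with <).
        if value > max_value or (value == max_value and key < max_slov):
            max_slov = key
            max_value = value  # record the new maximum for later comparisons
    return max_slov, max_value


# Build counts for the sample input from the task, lowercased first.
sample = {}
for word in "abc a bCd bC AbC BC BCD bcd ABC".lower().split():
    sample[word] = sample.get(word, 0) + 1

print(*most_frequent(sample))  # abc 3
```

Here abc and bcd both occur 3 times, and abc is printed because "abc" < "bcd".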

Ternick, 2021-11-08
@Ternick

My version:


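def main():
    # Collect every word from the file, lowercased so that case is ignored.
    list_from_line = []
    with open("input.txt") as f:
        for line in f:
            list_from_line += line.lower().strip().split()

    unique_keys = set(list_from_line)

    # Map each unique word to the number of times it occurs.
    top = {
        key: list_from_line.count(key) for key in unique_keys
    }

    max_count = max(top.values())

    # Keep only the (word, count) pairs that reach the maximum count.
    top_items = [item for item in top.items() if item[1] == max_count]

    # Among the tied words, take the lexicographically smallest one.
    result = min(top_items, key=lambda x: x[0])

    print(f"{result[0]} {result[1]}")

if __name__ == "__main__":
    main()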


I don't claim this is easy to follow)
I usually write code by leaning on ready-made functions.
I think it works.
The result of the first test:
abc 3
The result of the second test:
u 11
If something is not clear, I can explain, but I still recommend that you figure it out yourself)
It is better to work through the course on your own; this is far from the hardest task I have seen)
What will you do later, when the tasks get harder?
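
For comparison, the same result can probably be reached with collections.Counter from the standard library, which counts all words in a single pass instead of calling list.count once per unique word. This is a sketch of an alternative, not the code from this answer. The function name top_word is hypothetical, and it takes the text as a string rather than reading input.txt. It assumes the text contains at least one word, since min raises ValueError on an empty sequence.

```python
from collections import Counter


def top_word(text):
    """Return (word, count) for the most frequent word, ignoring case."""
    counts = Counter(text.lower().split())
    # Sort key: highest count first (hence the minus), then alphabetical order.
    word, count = min(counts.items(), key=lambda item: (-item[1], item[0]))
    return word, count


print(*top_word("abc a bCd bC AbC BC BCD bcd ABC"))  # abc 3
```

Using the tuple (-count, word) as the key lets a single min call handle both the maximum count and the lexicographic tie-break.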
