bash
fortran, 2017-06-12 22:06:51

Why does curl replace every "+" with a space in variables sent by POST?

I decided to build a small content aggregator that renders a nice SVG chart, but that is beside the point.
The site I want to pull content from protects itself by generating keys in hidden fields, and I have almost worked around that. I am checking the requests in Wireshark. First screenshot:
[screenshot: 83867359185a42d0be934f3024d7b6fd.png]
As you can see, the variables reach the server with spaces, although in curl I send text containing "+" characters.
[screenshot: e1c339fafc2f4e89a783b6a4c0150fea.png]
So the question boils down to this: why does curl replace every "+" with a space, and how do I stop it?

#!/bin/bash
# Step 1: fetch the page and save the response headers (cookies).
curl --dump-header /home/forttran/work/cookie1.txt \
     -H "User-Agent: w3m/0.5.3+debian-15" \
     -H "Accept: text/html, text/*;q=0.5, image/*, application/*, video/*, audio/*, message/*, x-content/*, inode/*, x-scheme-handler/*, misc/*" \
     -H "Accept-Language: ru;q=1.0,en;q=0.5" \
     -H "Host: www.moex.com" \
     http://www.moex.com/ru/derivatives/open-positions.aspx >index.html
#echo $headers_and_cookies
# Step 2: extract the hidden form fields from the saved page.
viewstate=$(sed -n 's/.*<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="\(.*\)" \/>.*/\1/p' index.html)
echo "$viewstate"
viewstategenerator=$(sed -n 's/.*<input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="\(.*\)" \/>.*/\1/p' index.html)
#echo $viewstategenerator
eventvalidation=$(sed -n 's/.*<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="\(.*\)" \/>.*/\1/p' index.html)
#echo $eventvalidation
# Step 3: post the form back. Note that -d sends the string as-is,
# without URL-encoding the values.
curl --dump-header /home/forttran/work/cookie2.txt \
     --trace-ascii /home/forttran/work/trace.txt \
     -d "__VIEWSTATE=$viewstate&__VIEWSTATEGENERATOR=$viewstategenerator&__EVENTVALIDATION=$eventvalidation&ctl00\$PageContent\$frmInstrumList=ALRS_F&ctl00\$PageContent\$frmDateTime\$CDateDay=12&ctl00\$PageContent\$frmDateTime\$CDateMonth=6&ctl00\$PageContent\$frmDateTime\$CDateYear=2017&ctl00\$PageContent\$frmButtom=Показать" \
     -H "User-Agent: w3m/0.5.3+debian-15" \
     -H "Accept: text/html, text/*;q=0.5, image/*, application/*, video/*, audio/*, message/*, x-content/*, inode/*, x-scheme-handler/*, misc/*" \
     -H "Accept-Encoding: gzip, compress, bzip, bzip2, deflate" \
     -H "Accept-Language: ru;q=1.0,en;q=0.5" \
     -H "Host: www.moex.com" \
     -H "Referer: http://www.moex.com/ru/derivatives/open-positions.aspx" \
     -H "Content-type: application/x-www-form-urlencoded" \
     http://www.moex.com/ru/derivatives/open-positions.aspx >result.html

Curiously, the w3m browser retrieves exactly what I need with identical headers.
This once again confirms that the problem lies in how the "+" character is handled.
Thank you all, I have sorted it out. I had to run the values through sed once more in the bash script and replace every + with %2B.
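
The fix described above could look roughly like the sketch below. In an application/x-www-form-urlencoded body a literal "+" is decoded as a space, so it has to be sent as %2B. The helper name encode_plus and the sample value are made up for illustration; only "+" is escaped here, other reserved characters are left alone.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): escape every literal "+"
# as %2B so a form-urlencoded parser does not turn it into a space.
encode_plus() {
    printf '%s' "$1" | sed 's/+/%2B/g'
}

# Made-up sample standing in for a real __VIEWSTATE value.
sample='/wEPDwUK+abc/def+ghi='
encoded=$(encode_plus "$sample")
echo "$encoded"   # prints /wEPDwUK%2Babc/def%2Bghi=
```

In the script from the question this would presumably be applied to each extracted field before the POST, for example viewstate=$(encode_plus "$viewstate").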

1 answer(s)
toxa82, 2017-06-22
@fortran

You can try the --data-urlencode flag: --data-urlencode "text=${message}"
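
A minimal sketch of how that flag might be applied to the form fields from the question, assuming each field is passed as its own --data-urlencode "name=value" pair. curl then percent-encodes the value part itself (the name part is sent as-is), so a "+" goes out as %2B. The wrapper name post_form and its argument order are assumptions for illustration, and only three of the fields are shown.

```shell
#!/bin/bash
# Hypothetical wrapper (not from the original script).
#   $1 = URL, $2 = __VIEWSTATE value, $3 = __EVENTVALIDATION value
post_form() {
    # Each --data-urlencode pair is encoded by curl; multiple pairs are
    # joined with "&" into one POST body.
    curl --data-urlencode "__VIEWSTATE=$2" \
         --data-urlencode "__EVENTVALIDATION=$3" \
         --data-urlencode 'ctl00$PageContent$frmInstrumList=ALRS_F' \
         "$1"
}

# Example call, left commented out so the sketch sends nothing by itself:
# post_form 'http://www.moex.com/ru/derivatives/open-positions.aspx' "$viewstate" "$eventvalidation"
```

Compared with replacing "+" by hand, this leaves the encoding to curl, which also covers other characters that are not safe in a form body.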
