Escaping input
Hi. Tell me, Habr folks: I'm writing a project with fields where the user can enter content, for example to add news to the site. Which input characters should I pay attention to and escape or remove in order to prevent injections? I have googled the topic; can you recommend anything else?
PS: I'll rephrase the question a little. The user adds news and inserts tags; the task is to filter the input so that the output is text with HTML tags, but with any injected code filtered out.
Django html sanitizer
You specify which tags are allowed, and the rest are removed, for example, as on Habr.
Or search Google for "python white list html sanitizer": even if you don't find a library that meets all your requirements, you will at least understand which direction to dig in.
Or take any Python HTML parser and roll your own: just walk the tree and remove the tags that are not in the allowed list. That way you can also support custom tags as on Habr, for example <slideshow>. Keep in mind that if users are allowed to insert code snippets, everything inside <code> tags needs to be escaped, so that a piece of HTML or JS inserted into an article or a comment is shown as text and not executed by the browser.
In general, there are many nuances, depending on what is required.
I am writing from a tablet, there are a lot of errors, I apologize.
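To make the "walk the tree and keep only allowed tags" idea concrete, here is a minimal sketch built on the standard library's html.parser. The whitelist, the class name WhitelistSanitizer and the function sanitize are hypothetical names chosen for illustration; the sketch drops all attributes, discards the contents of script and style, and turns everything inside <code> into escaped text. It is not a substitute for a maintained sanitizer library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; "slideshow" stands in for a custom tag.
ALLOWED_TAGS = frozenset({"b", "i", "em", "strong", "p", "code", "slideshow"})
# Tags whose text content is discarded too, not just the tag itself.
DROP_CONTENT = frozenset({"script", "style"})


class WhitelistSanitizer(HTMLParser):
    def __init__(self, allowed_tags=ALLOWED_TAGS):
        super().__init__(convert_charrefs=True)
        self.allowed_tags = allowed_tags
        self.out = []          # output fragments
        self.open_tags = []    # allowed tags that are currently open
        self.in_code = False   # inside <code>: markup becomes plain text
        self.dropping = False  # inside <script>/<style>: discard text

    def handle_starttag(self, tag, attrs):
        if self.in_code:
            self.out.append(escape(self.get_starttag_text()))
        elif tag in DROP_CONTENT:
            self.dropping = True
        elif tag == "code" and tag in self.allowed_tags:
            self.in_code = True
            self.out.append("<code>")
        elif tag in self.allowed_tags:
            self.out.append(f"<{tag}>")  # all attributes are discarded
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        if self.in_code:
            self.out.append(escape(self.get_starttag_text()))
        else:
            self.handle_starttag(tag, attrs)
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        if self.in_code:
            if tag == "code":
                self.in_code = False
                self.out.append("</code>")
            else:
                self.out.append(escape(f"</{tag}>"))
        elif tag in DROP_CONTENT:
            self.dropping = False
        elif tag in self.open_tags:
            # Close inner tags first so the output stays balanced.
            while True:
                inner = self.open_tags.pop()
                self.out.append(f"</{inner}>")
                if inner == tag:
                    break

    def handle_data(self, data):
        if not self.dropping:
            self.out.append(escape(data))

    def result(self):
        # Close anything the author left open.
        tail = ["</code>"] if self.in_code else []
        tail += [f"</{t}>" for t in reversed(self.open_tags)]
        return "".join(self.out + tail)


def sanitize(html_text, allowed_tags=ALLOWED_TAGS):
    parser = WhitelistSanitizer(allowed_tags)
    parser.feed(html_text)
    parser.close()
    return parser.result()
```

With this sketch, sanitize('<b onclick="x()">Hi</b><script>alert(1)</script>') returns '<b>Hi</b>', and markup placed inside <code> comes back as escaped text. Comments and doctype declarations are silently dropped because the parser's default handlers ignore them.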
No, it's better not to be afraid and just use an ORM.
Even then, it will not give a 100% guarantee of safety. For example, you might have a drop-down menu for choosing which field to sort by, and carelessly pass its value straight into the ORDER BY clause of the query.
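The ORDER BY pitfall can be sketched like this. Column names cannot be passed as query placeholders, so one common approach is to map the value from the drop-down onto a fixed set of known columns. The table news, its columns, and the names SORT_COLUMNS and list_news are assumptions made up for this example, and sqlite3 is used only to keep it self-contained.

```python
import sqlite3

# Hypothetical mapping from values the drop-down may send to real columns.
SORT_COLUMNS = {"title": "title", "date": "created_at"}


def list_news(conn, sort_key="date", descending=False):
    # Identifiers cannot be bound as placeholders, so the column name is
    # looked up in a fixed table instead of being copied from the request.
    # Unknown keys silently fall back to the default column.
    column = SORT_COLUMNS.get(sort_key, "created_at")
    direction = "DESC" if descending else "ASC"
    query = f"SELECT title FROM news ORDER BY {column} {direction}"
    return [row[0] for row in conn.execute(query)]
```

Only strings that come from SORT_COLUMNS or the two literal directions ever reach the SQL text, so a value such as "title; DROP TABLE news" simply falls back to the default sort.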
You don't need to strip anything. When inserting into the database, quotes must of course be escaped (or better, use PDO, placeholders, an ORM, or something similar). And when outputting to the page, use htmlspecialchars, if we are talking about PHP.
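The same two-step approach (placeholders on the way into the database, escaping on the way out) looks roughly like this in Python, where html.escape plays the role that htmlspecialchars plays in PHP. The comments table and the function names save_comment and render_comments are assumptions for the sake of the example.

```python
import sqlite3
from html import escape


def save_comment(conn, author, body):
    # Placeholders keep the values out of the SQL text entirely,
    # so quotes in user input need no manual escaping.
    conn.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)",
        (author, body),
    )


def render_comments(conn):
    # The text is stored as entered and escaped only when it is
    # written into HTML.
    rows = conn.execute("SELECT author, body FROM comments ORDER BY id")
    return "\n".join(
        f"<p><b>{escape(author)}</b>: {escape(body)}</p>"
        for author, body in rows
    )
```

Storing the raw text and escaping at output time means nothing has to be cut from the user's input, which is the point of the answer above.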
If you are using Django, then you don't need to worry about this. The ORM takes care of escaping the quotes, and in templates all output strings are escaped by default to protect against XSS.
Just note that not only the tags need to be cleaned, but also their attributes, so that, for example, nobody can apply a CSS style that renders the text in huge letters, and so on.
By the way, there was a recent article on Habr where the Habr parser was ported to Python.
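As a rough illustration of attribute cleaning, the sketch below keeps only whitelisted attributes per tag and rejects URL attributes whose scheme is not on a short safe list. The names ALLOWED_ATTRS, SAFE_SCHEMES, is_safe_url and clean_attrs are hypothetical, and the input is assumed to be the list of (name, value) pairs that Python's html.parser passes to handle_starttag (names lowercased, values already unescaped).

```python
import re
from html import escape
from urllib.parse import urlsplit

# Hypothetical per-tag attribute whitelist; "style" and event handlers
# such as "onclick" are absent on purpose.
ALLOWED_ATTRS = {"a": {"href", "title"}, "img": {"src", "alt"}}
URL_ATTRS = {"href", "src"}
SAFE_SCHEMES = {"", "http", "https", "mailto"}  # "" means a relative URL


def is_safe_url(value):
    # Remove whitespace and control characters before checking the scheme,
    # so that " java\tscript:..." is not mistaken for a relative URL.
    compact = re.sub(r"[\x00-\x20]", "", value)
    try:
        return urlsplit(compact).scheme in SAFE_SCHEMES
    except ValueError:
        return False


def clean_attrs(tag, attrs):
    allowed = ALLOWED_ATTRS.get(tag, set())
    kept = []
    for name, value in attrs:
        value = value or ""  # attributes without a value arrive as None
        if name not in allowed:
            continue
        if name in URL_ATTRS and not is_safe_url(value):
            continue
        kept.append(f'{name}="{escape(value)}"')
    return " ".join(kept)
```

For example, clean_attrs("a", [("href", "javascript:alert(1)"), ("style", "font-size:90px"), ("title", "ok")]) returns 'title="ok"'. Allowing style at all would require parsing the CSS itself, which is why this sketch simply drops it.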