PHP
Victor Berdyansky, 2016-12-13 01:00:28

How do I truncate text without breaking the HTML markup?

Good evening.
I need to let users add articles to a site through a WYSIWYG (HTML) editor.
Article previews need a truncated version of that text, and cutting it at an arbitrary point risks breaking the HTML, for example by leaving tags unclosed.
I have two ideas for solving this:
1) On output, truncate the text and check the result for correctness; in my case, run it through a library (HTMLPurifier for Laravel 5) that cleans up and repairs the broken HTML.
2) When the article is added, truncate the text, run it through the cleanup from option 1, and save the result in the database, in the article's summary column.
Option 1 seems terrible to me, since the cleanup function would have to be called every time the article is displayed.
Could you point me in the right direction?


3 answer(s)
xmoonlight, 2016-12-13
@xmoonlight

Is it really so hard to split the finished article content, tags included, into two parts?
1. Count off the required number of characters of plain text, skipping the tags and pushing each opened tag onto a stack (when a tag is closed, pop it from the stack).
2. Cut the text at that point.
3. Close the tags remaining on the stack.
4. Save the result to a cache (a column in the database).
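The first three steps can be sketched roughly as follows. This is only an illustration, not a hardened implementation: the function name `truncateHtml` is hypothetical, the input is assumed to be properly nested, attribute values are assumed not to contain `>`, and entities such as `&amp;` are counted as several characters and may be cut in the middle. It relies on the mbstring extension.

```php
<?php
// Hypothetical helper: keep at most $limit characters of visible text,
// then close every tag that is still open at the cut point.
function truncateHtml(string $html, int $limit): string
{
    // Elements that never take a closing tag, so they must not go on the stack.
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
             'input', 'link', 'meta', 'source', 'track', 'wbr'];

    // Split into tags and text runs, keeping the tags as separate tokens.
    $tokens = preg_split('/(<[^>]+>)/u', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $stack = [];  // names of the tags that are currently open
    $out = '';
    $count = 0;   // visible characters emitted so far

    foreach ($tokens as $token) {
        if ($token[0] === '<') {
            if (preg_match('/^<\/([a-z][a-z0-9]*)/i', $token, $m)) {
                // Closing tag: pop it if it matches the innermost open tag.
                if (end($stack) === strtolower($m[1])) {
                    array_pop($stack);
                }
            } elseif (preg_match('/^<([a-z][a-z0-9]*)/i', $token, $m)) {
                // Opening tag: remember it unless it is void or self-closed.
                $name = strtolower($m[1]);
                if (!in_array($name, $void, true) && substr($token, -2) !== '/>') {
                    $stack[] = $name;
                }
            }
            $out .= $token; // tags do not count toward the limit
            continue;
        }

        // Text run: copy as much as still fits, then stop at the limit.
        $remaining = $limit - $count;
        $out .= mb_substr($token, 0, $remaining, 'UTF-8');
        $count += min(mb_strlen($token, 'UTF-8'), $remaining);
        if ($count >= $limit) {
            break;
        }
    }

    // Close whatever is still open, innermost tag first.
    while ($stack) {
        $out .= '</' . array_pop($stack) . '>';
    }
    return $out;
}
```

For example, `truncateHtml('<p>Hello <b>wonderful world</b></p>', 10)` should yield `<p>Hello <b>wond</b></p>`.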
I think it would be better to store two columns here: [number of characters] and [finished short content].
When the preview length is changed in the admin panel, the cache can be refreshed for the rows whose [number of characters] column no longer matches.
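A minimal sketch of that two-column cache, assuming an article row is available as an array. The column names `content`, `preview_length` and `preview_html`, the function name `refreshPreview`, and the idea of passing the truncation routine as a callable are all assumptions made for this illustration; saving the row back to the database is left out.

```php
<?php
// Hypothetical helper: rebuild the cached preview only when the configured
// length differs from the length the cached preview was built with.
function refreshPreview(array $article, int $limit, callable $truncate): array
{
    // Database drivers often return numbers as strings, so compare as int.
    // A missing value becomes -1 and therefore always triggers a rebuild.
    $cachedLength = (int) ($article['preview_length'] ?? -1);

    if ($cachedLength !== $limit || !isset($article['preview_html'])) {
        $article['preview_html'] = $truncate($article['content'], $limit);
        $article['preview_length'] = $limit;
    }
    return $article;
}
```

The caller would then write the returned row back only if `preview_length` changed, so the truncation cost is paid once per article per setting change rather than on every page view.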

ThunderCat, 2016-12-13
@ThunderCat

xmoonlight, no, in general it's clear: a tag was opened, some text inside it followed, we chopped the text off there, so the closing tag was lost and the whole HTML fell apart. What isn't clear is why the preview needs text with tags at all; it's easier to strip the tags and show plain text. Or, as was written above, there are plenty of ready-made solutions. Whether to store these pre-cut pieces in the database is a question of which resource you would rather spend: database storage or memory/CPU.
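The "strip the tags and show plain text" variant could look something like this. The function name `plainPreview` and the trailing `...` are assumptions for the sketch; note that `strip_tags` simply removes tags, so text from adjacent block elements can end up glued together.

```php
<?php
// Hypothetical helper: build a plain-text preview with no markup at all.
function plainPreview(string $html, int $limit): string
{
    // Remove the tags, then turn entities such as &amp; back into characters
    // so that they count as one character each.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace left over from the markup.
    $text = trim(preg_replace('/\s+/u', ' ', $text));

    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }
    return rtrim(mb_substr($text, 0, $limit, 'UTF-8')) . '...';
}
```

Because the result is decoded plain text, it should be escaped again on output, for example with `htmlspecialchars()`.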

Eugene Volf, 2016-12-13
@Wolfnsex

Option 1 seems terrible to me, since the cleanup function would have to be called every time the article is displayed.

Why? You can store the already truncated text in the database, in an additional field (if the standard one doesn't suit you).
*I don't understand why everyone is so afraid of adding some "extra" data to the database... Do you need to store gigabytes of text there, with the database running on a processor from a calculator?
It's also not entirely clear what "a risk of violating the integrity of the code" means. Do you need to cut text out of the markup? There are tools for that which work with the DOM and let you change HTML elements and their contents without breaking the markup, for example phpQuery (also Symfony DomCrawler, Simple HTML DOM, etc.), if you really want to fiddle with the HTML...
P.S. With HTML editors it often happens that either the user pastes some nonsense into the code, or the editor itself mangles the code... So I still wouldn't dismiss the approach of "storing normalized data together with the original" (which I described above). It would be more accurate to say "storing the original data next to the normalized data", if the original is still needed for some reason...
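To illustrate the DOM-based approach without pulling in any of the libraries named above, here is a rough sketch using PHP's built-in DOM extension instead. The function names `trimNode` and `truncateHtmlDom` are hypothetical, and the `<?xml encoding="UTF-8">` prefix is a commonly used workaround to make `loadHTML` read the input as UTF-8. Since the parser repairs the markup while loading it, the output is well formed even when the editor produced sloppy HTML, and entities count as single characters.

```php
<?php
// Walk the tree in document order, trimming text nodes until $remaining
// characters are used up, then remove every node after the cut point.
function trimNode(DOMNode $node, int &$remaining): void
{
    // Copy the child list first, because nodes are removed while iterating.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($remaining <= 0) {
            $node->removeChild($child);
        } elseif ($child instanceof DOMText) {
            $length = mb_strlen($child->data, 'UTF-8');
            if ($length > $remaining) {
                $child->data = mb_substr($child->data, 0, $remaining, 'UTF-8');
            }
            $remaining -= min($length, $remaining);
        } elseif ($child instanceof DOMElement) {
            trimNode($child, $remaining);
        }
    }
}

// Hypothetical helper: truncate an HTML fragment to $limit text characters.
function truncateHtmlDom(string $html, int $limit): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The parser wraps a fragment in <html><body>; work inside the body.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $remaining = $limit;
    trimNode($body, $remaining);

    // Serialize only the body's children, not the wrapper elements.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

The result of a call such as `truncateHtmlDom($article, 200)` can be stored in the additional field described above, so the parsing cost is paid once when the article is saved.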
