Safely truncate a string that contains color tags
Posted By: Anonymous
I have a string which contains color tags.
var myString = "My name is <color=#FF00EE>ABCDE</color> and I love <color=#FFEE00>music</color>";
When rendered, the string becomes "My name is ABCDE*(pink)* and I love music*(yellow)*"
I want to truncate the string when it exceeds a maximum length, while keeping the color tags intact:
var myTruncateString = "My name is <color=#FF00EE>ABCDE</color> and I love <color=#FFEE00>mu</color>";
When rendered, the truncated string becomes "My name is ABCDE*(pink)* and I love mu*(yellow)*"
Do you have any suggestions? This is what I have so far:
var stringWithoutFormat = String.Copy(myString);
stringWithoutFormat = Regex.Replace(stringWithoutFormat, "<color.*?>|</color>", "");
var maxLength = 20;
if (stringWithoutFormat.Length > maxLength)
{
    // What should I do next?
}
Solution
Here’s a relatively simple example, with no error handling, of what I think you’re trying to accomplish:
- Don’t count color tags when checking maximum length
- Remove characters from the end, don’t destroy color tags
- If you end up with color tags with no text between them, remove those tags
Note: This code is not thoroughly tested. Feel free to use it for whatever you want, but I would write a lot of unit tests for it. In particular, I’m worried about edge cases that could lead to an infinite loop.
using System;
using System.Collections.Generic;
using System.Linq;
public static string Shorten(string input, int requiredLength)
{
    var tokens = Tokenize(input).ToList();
    int current = tokens.Count - 1;
    // assumption: color tags don't contribute to *visible* length
    var totalLength = tokens.Count(t => t.Length == 1);
    // state of the previous iteration, used for infinite-loop detection
    int lastCurrent = -1;
    int lastTotalLength = -1;
    while (totalLength > requiredLength && current >= 0)
    {
        // infinite-loop detection: every iteration must change at least one of these
        if (lastCurrent == current && lastTotalLength == totalLength)
            throw new InvalidOperationException("Infinite loop detected");
        lastCurrent = current;
        lastTotalLength = totalLength;
        if (tokens[current].Length > 1)
        {
            // the current token is a tag
            if (current == 0)
                return "";
            if (tokens[current].StartsWith("</") && tokens[current - 1].StartsWith("<c"))
            {
                // Remove a <color></color> pair with no text between
                tokens.RemoveAt(current);
                tokens.RemoveAt(current - 1);
                current -= 2;
                // Since color tags don't contribute to length, don't adjust totalLength
                continue;
            }
            // Remove one character from inside the color tags
            tokens.RemoveAt(current - 1);
            current--;
            totalLength--;
        }
        else
        {
            // Remove last character from string
            tokens.RemoveAt(current);
            current--;
            totalLength--;
        }
    }
    // If we're now at the right length, but the last two tokens are <color></color>, remove them
    if (tokens.Count >= 2 && tokens.Last().StartsWith("</") && tokens[tokens.Count - 2].StartsWith("<c"))
    {
        tokens.RemoveAt(tokens.Count - 1);
        tokens.RemoveAt(tokens.Count - 1);
    }
    return string.Join("", tokens);
}
public static IEnumerable<string> Tokenize(string input)
{
    int index = 0;
    while (index < input.Length)
    {
        if (input[index] == '<')
        {
            // A tag: everything up to and including the next '>' is one token
            int endIndex = index;
            while (endIndex < input.Length && input[endIndex] != '>')
                endIndex++;
            if (endIndex < input.Length)
                endIndex++;
            yield return input.Substring(index, endIndex - index);
            index = endIndex;
        }
        else
        {
            // Any other character is a token of its own
            yield return input.Substring(index, 1);
            index++;
        }
    }
}
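To make the token stream concrete: each tag becomes one token and every other character becomes its own token. The sketch below expresses the same splitting with a regular expression. The class name RegexTokenizer is made up for illustration and is not part of the code above; it only assumes that a tag starts at '<' and runs to the next '>' (or to the end of the string if there is none).

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical alternative to the hand-written Tokenize above.
public static class RegexTokenizer
{
    // "<[^>]*>?" matches a tag: '<', anything that is not '>', then an optional '>'.
    // "[^<]"     matches any single character that does not start a tag.
    private const string TokenPattern = "<[^>]*>?|[^<]";

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        foreach (Match match in Regex.Matches(input, TokenPattern))
            tokens.Add(match.Value);
        return tokens;
    }
}
```

For the input "a<color=#fff>b</color>" this yields four tokens: "a", "<color=#fff>", "b" and "</color>". Shorten then only has to treat tokens longer than one character as tags.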
Example code:
var myString = "My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>";
for (int length = 1; length < 100; length++)
    Console.WriteLine($"{length}: {Shorten(myString, length)}");
Output:
1: M
2: My
3: My
4: My n
5: My na
6: My nam
7: My name
8: My name
9: My name i
10: My name is
11: My name is
12: My name is <color=#ff00ee>A</color>
13: My name is <color=#ff00ee>AB</color>
14: My name is <color=#ff00ee>ABC</color>
15: My name is <color=#ff00ee>ABCD</color>
16: My name is <color=#ff00ee>ABCDE</color>
17: My name is <color=#ff00ee>ABCDE</color>
18: My name is <color=#ff00ee>ABCDE</color> a
19: My name is <color=#ff00ee>ABCDE</color> an
20: My name is <color=#ff00ee>ABCDE</color> and
21: My name is <color=#ff00ee>ABCDE</color> and
22: My name is <color=#ff00ee>ABCDE</color> and I
23: My name is <color=#ff00ee>ABCDE</color> and I
24: My name is <color=#ff00ee>ABCDE</color> and I l
25: My name is <color=#ff00ee>ABCDE</color> and I lo
26: My name is <color=#ff00ee>ABCDE</color> and I lov
27: My name is <color=#ff00ee>ABCDE</color> and I love
28: My name is <color=#ff00ee>ABCDE</color> and I love
29: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>m</color>
30: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>mu</color>
31: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>mus</color>
32: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>musi</color>
33: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
34: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
35: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
36: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
37: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
38: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
39: My name is <color=#ff00ee>ABCDE</color> and I love <color=#eeddff>music</color>
... and so on
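If the possibility of an infinite loop is a concern, the same three rules can also be applied in a single forward pass that always advances through the input. This is a different technique from the code above, shown only as a minimal sketch. The names ColorTagTruncator and Truncate are made up for illustration, and the sketch assumes that color tags are never nested and that the closing tag is always "</color>".

```csharp
using System;
using System.Text;

// Hypothetical forward-pass variant; not the method used in the answer above.
public static class ColorTagTruncator
{
    public static string Truncate(string input, int maxVisible)
    {
        var output = new StringBuilder();
        string pendingOpenTag = ""; // opening tag that was read but not yet written
        bool tagIsOpen = false;     // an opening tag was written and is not closed yet
        int visible = 0;            // visible characters written so far
        int index = 0;

        while (index < input.Length && visible < maxVisible)
        {
            if (input[index] == '<')
            {
                // Read the whole tag; an unterminated tag runs to the end of the input.
                int end = input.IndexOf('>', index);
                if (end < 0) end = input.Length - 1;
                string tag = input.Substring(index, end - index + 1);
                index = end + 1;

                if (tag.StartsWith("</", StringComparison.Ordinal))
                {
                    // Write the closing tag only if its opening tag was written;
                    // otherwise the pair had no text in it and both are dropped.
                    if (tagIsOpen) output.Append(tag);
                    pendingOpenTag = "";
                    tagIsOpen = false;
                }
                else
                {
                    // Hold the opening tag back until a visible character follows it.
                    pendingOpenTag = tag;
                }
            }
            else
            {
                if (pendingOpenTag.Length > 0)
                {
                    output.Append(pendingOpenTag);
                    pendingOpenTag = "";
                    tagIsOpen = true;
                }
                output.Append(input[index]);
                index++;
                visible++;
            }
        }

        // Assumption: the only closing tag is "</color>".
        if (tagIsOpen) output.Append("</color>");
        return output.ToString();
    }
}
```

With the example string above, ColorTagTruncator.Truncate(myString, 30) should give the same result as line 30 of the output, and a length of 11 should give "My name is " with no tags. Because index increases on every iteration, the loop cannot run longer than the input.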
Answered By: Anonymous
Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.