Fix Code Error

Best way to strip punctuation from a string

March 13, 2021 by Code Error
Posted By: Lawrence Johnston

It seems like there should be a simpler way than:

import string
s = "string. With. Punctuation?" # Sample string 
out = s.translate(string.maketrans("",""), string.punctuation)

Is there?

Solution

From an efficiency perspective, you’re not going to beat this (Python 2):

s.translate(None, string.punctuation)

In Python 3, where str.translate takes a single table argument, use:

s.translate(str.maketrans('', '', string.punctuation))

It’s performing raw string operations in C with a lookup table. Not much will beat that, short of writing your own C code.
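
As a minimal Python 3 sketch of the translate approach above (the sample string is the one from the question; the name `table` is only illustrative), the deletion table can be built once and reused for every call:

```python
import string

# The third argument of str.maketrans lists the characters to delete.
# Building the table once avoids rebuilding it on every call.
table = str.maketrans('', '', string.punctuation)

s = "string. With. Punctuation?"
out = s.translate(table)
print(out)  # string With Punctuation
```

Reusing the table matters mostly when many strings are processed in a loop.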

If speed isn’t a concern, another option is:

exclude = set(string.punctuation)
s = ''.join(ch for ch in s if ch not in exclude)

This is faster than calling s.replace once per punctuation character, but it won’t perform as well as approaches that do the work outside pure Python, such as regexes or string.translate, as the timings below show. For this type of problem, doing the work at as low a level as possible pays off.
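
The set-based and regex-based variants mentioned above can be sketched side by side. This is an illustration under the assumption of Python 3 and the question's sample string; the names `by_set` and `by_regex` are hypothetical. Both produce the same output and differ only in where the per-character work happens:

```python
import re
import string

s = "string. With. Punctuation?"

# Pure-Python filter: one membership test per character.
exclude = set(string.punctuation)
by_set = ''.join(ch for ch in s if ch not in exclude)

# Regex: a character class built from the escaped punctuation characters.
regex = re.compile('[%s]' % re.escape(string.punctuation))
by_regex = regex.sub('', s)

print(by_set == by_regex)  # True
```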

Timing code (Python 2):

import re, string, timeit

s = "string. With. Punctuation"
exclude = set(string.punctuation)
table = string.maketrans("","")
regex = re.compile('[%s]' % re.escape(string.punctuation))

def test_set(s):
    return ''.join(ch for ch in s if ch not in exclude)

def test_re(s):  # From Vinko's solution, with fix.
    return regex.sub('', s)

def test_trans(s):
    return s.translate(table, string.punctuation)

def test_repl(s):  # From S.Lott's solution
    for c in string.punctuation:
        s=s.replace(c,"")
    return s

print "sets      :",timeit.Timer('f(s)', 'from __main__ import s,test_set as f').timeit(1000000)
print "regex     :",timeit.Timer('f(s)', 'from __main__ import s,test_re as f').timeit(1000000)
print "translate :",timeit.Timer('f(s)', 'from __main__ import s,test_trans as f').timeit(1000000)
print "replace   :",timeit.Timer('f(s)', 'from __main__ import s,test_repl as f').timeit(1000000)

This gives the following results (total seconds for 1,000,000 calls of each function; lower is better):

sets      : 19.8566138744
regex     : 6.86155414581
translate : 2.12455511093
replace   : 28.4436721802
Answered By: Brian


Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.
