Fix Code Error

How do I append one string to another in Python?

March 13, 2021 by Code Error
Posted By: Anonymous

I want an efficient way to append one string to another in Python, other than the following.

var1 = "foo"
var2 = "bar"
var3 = var1 + var2

Is there any good built-in method to use?

Solution

If you have only one reference to a string and you concatenate another string onto the end, CPython special-cases this and tries to extend the string in place.

The end result is that the operation is amortized O(n).

e.g.

s = ""
for i in range(n):
    s+=str(i)

used to be O(n^2), but now it is O(n).
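The "only one reference" condition can be illustrated with a minimal sketch. The function names and the `alias` variable below are hypothetical, introduced only for this illustration. Both functions return the same string; the difference is that the second keeps a second reference to the old string alive on every iteration, which, assuming the behaviour described above, is expected to stop CPython from extending it in place.

```python
def build_single_ref(n):
    """Append in a loop while `s` is the only reference to the string."""
    s = ""
    for i in range(n):
        s += str(i)  # CPython may be able to extend `s` in place here
    return s


def build_with_alias(n):
    """Same loop, but a second name refers to the old string each time."""
    s = ""
    for i in range(n):
        alias = s    # second reference to the current string object
        s += str(i)  # in-place extension is not expected to apply now
    return s


print(build_single_ref(5))                         # prints 01234
print(build_with_alias(5) == build_single_ref(5))  # prints True
```

The results are identical either way; only the running time for large n should differ, and by how much depends on the interpreter and its version.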

From the source (bytesobject.c):

void
PyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)
{
    PyBytes_Concat(pv, w);
    Py_XDECREF(w);
}


/* The following function breaks the notion that strings are immutable:
   it changes the size of a string.  We get away with this only if there
   is only one module referencing the object.  You can also think of it
   as creating a new string object and destroying the old one, only
   more efficiently.  In any case, don't use this if the string may
   already be known to some other part of the code...
   Note that if there's not enough memory to resize the string, the original
   string object at *pv is deallocated, *pv is set to NULL, an "out of
   memory" exception is set, and -1 is returned.  Else (on success) 0 is
   returned, and the value in *pv may or may not be the same as on input.
   As always, an extra byte is allocated for a trailing \0 byte (newsize
   does *not* include that), and a trailing \0 byte is stored.
*/

int
_PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)
{
    register PyObject *v;
    register PyBytesObject *sv;
    v = *pv;
    if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {
        *pv = 0;
        Py_DECREF(v);
        PyErr_BadInternalCall();
        return -1;
    }
    /* XXX UNREF/NEWREF interface should be more symmetrical */
    _Py_DEC_REFTOTAL;
    _Py_ForgetReference(v);
    *pv = (PyObject *)
        PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);
    if (*pv == NULL) {
        PyObject_Del(v);
        PyErr_NoMemory();
        return -1;
    }
    _Py_NewReference(*pv);
    sv = (PyBytesObject *) *pv;
    Py_SIZE(sv) = newsize;
    sv->ob_sval[newsize] = '\0';
    sv->ob_shash = -1;          /* invalidate cached hash value */
    return 0;
}

It’s easy enough to verify empirically.

$ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
1000000 loops, best of 3: 1.85 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
10000 loops, best of 3: 16.8 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
10000 loops, best of 3: 158 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
1000 loops, best of 3: 1.71 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 14.6 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
10 loops, best of 3: 173 msec per loop
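The commands above use `xrange`, so they are Python 2. A rough Python 3 equivalent of the same measurement could look like the sketch below, which uses the standard `timeit` module from inside a script. The helper names `append_loop` and `time_per_call` are made up for this example, and the numbers it prints will differ from machine to machine.

```python
import timeit


def append_loop(n):
    """Build a string of n characters by repeated += on a single name."""
    s = ""
    for _ in range(n):
        s += "a"
    return s


def time_per_call(n, repeat=3, number=10):
    """Return the best observed time in seconds for one append_loop(n) call."""
    timings = timeit.repeat(lambda: append_loop(n), repeat=repeat, number=number)
    return min(timings) / number


for n in (10, 100, 1000, 10000, 100000):
    print(f"n={n:>7}: {time_per_call(n) * 1e6:12.2f} usec per call")
```

If the time per call grows by roughly a factor of ten for each tenfold increase in n, the loop is behaving linearly on that interpreter.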

It’s important to note, however, that this optimisation isn’t part of the Python spec. As far as I know, it’s only in the CPython implementation. The same empirical testing on PyPy or Jython, for example, might show the older O(n**2) performance.

$ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
10000 loops, best of 3: 90.8 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
1000 loops, best of 3: 896 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
100 loops, best of 3: 9.03 msec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
10 loops, best of 3: 89.5 msec per loop

So far so good, but then,

$ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 12.8 sec per loop

Ouch, even worse than quadratic. So PyPy is doing something that works well with short strings but performs poorly for larger strings.
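Since the in-place append is an implementation detail, code that has to run well on other interpreters can avoid relying on it. One common approach, and the one the Python documentation suggests for linear-time concatenation, is to collect the pieces in a list and join them once at the end. A minimal sketch (the function name `concat_with_join` is hypothetical):

```python
def concat_with_join(n):
    """Build the same string as the += loop, but with a single join."""
    parts = []
    for i in range(n):
        parts.append(str(i))  # appending to a list is amortized O(1)
    return "".join(parts)     # one pass over all the pieces


print(concat_with_join(5))  # prints 01234
```

For a simple case like this, `"".join(str(i) for i in range(n))` does the same thing in one line. For just two or three strings, plain `+` as in the question is fine.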

Answered By: Anonymous


Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.
