C++ - Change String Array to tolower
In my program I have a text file that is read into an array, which tokenizes each word. I need a way to compare these words to the words found in a binary tree. The issue is that some duplicates of a word are not formatted the same way (one is uppercase and one is lowercase), and I need them to match so they can be found in the binary tree.
So my question is: how do I change my whole array to lowercase?
Here is what I have tried so far:
    #include <iostream>
    #include "binary_searchtree.h"
    #include "node.h"
    #include <string>
    #include <fstream>
    #include <sstream>

    using namespace std;

    const int size = 100;
    string myarray[size];

    int main() {
        // the first constructor is used since the tree is empty
        binary_searchtree<string> *tree = new binary_searchtree<string>();
        string token, lines;
        ifstream file("hashtags.txt");
        while (getline(file, lines)) {
            tree -> insertnode(lines);
        }

        // convert the strings in myarray to all-lowercase
        myarray = tolower(myarray);

        // tokenize the tweet into an array to search
        ifstream tweet1("exampletweet.txt");
        if (tweet1.is_open()) {
            while (getline(tweet1, token)) {
                for (int i = 0; i < size; ++i) {
                    tweet1 >> myarray[i];
                }
            }
            tweet1.close();
        }
With C++11 and later, you can downcase an array of strings like this:

    #include <algorithm>
    #include <cctype>
    #include <string>

    std::string myarray[23];

    // ...

    for (std::string & s : myarray)
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char c) { return std::tolower(c); });

Alternatively:

    for (std::string & s : myarray)
        std::for_each(s.begin(), s.end(),
                      [](char & c) { c = std::tolower(static_cast<unsigned char>(c)); });

Or even:

    for (std::string & s : myarray)
        for (char & c : s)
            c = std::tolower(static_cast<unsigned char>(c));

If you only have C++98 support, use the following loops:

    for (std::size_t i = 0; i != 23; ++i) {
        std::string & s = myarray[i];
        for (std::string::iterator it = s.begin(), e = s.end(); it != e; ++it) {
            *it = std::tolower(static_cast<unsigned char>(*it));
        }
    }

You get the idea.
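
To tie these variants together, here is a minimal sketch that wraps the C++11 version in two helpers and applies it to a whole built-in array. The names `to_lower_in_place` and `lowercase_all` are hypothetical and do not come from the original post, and the sketch only handles single-byte characters as interpreted by the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: lowercases a single string in place.
inline void to_lower_in_place(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       // std::tolower returns int; narrow it back to char.
                       return static_cast<char>(std::tolower(c));
                   });
}

// Hypothetical helper: lowercases every string in a built-in array.
// The array size N is deduced, so no magic number is needed at the call site.
template <std::size_t N>
void lowercase_all(std::string (&words)[N]) {
    for (std::string& s : words) {
        to_lower_in_place(s);
    }
}
```

With something like this, calling `lowercase_all(myarray)` after the array has been filled, and lowercasing each line before it is inserted into the tree, should make both sides compare equal regardless of the original casing.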
Don't forget to convert the character to unsigned char, since that's what std::tolower expects. (See this question for a discussion.) Many C I/O functions are expressed in terms of unsigned char-converted-to-int, since an int is big enough to represent all values of an unsigned char plus additional out-of-band information, and char and unsigned char are roundtrip convertible both ways as well as layout-compatible.
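
The conversion rule can be sketched as a small per-character wrapper. The name `lower_char` is hypothetical, and the sketch assumes the default "C" locale, in which only the letters A to Z are changed.

```cpp
#include <cctype>

// Hypothetical wrapper: converts the char to unsigned char first, so that a
// negative char value is never passed to std::tolower as a negative int.
inline char lower_char(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    // std::tolower takes and returns int; for an unsigned char argument the
    // result is again representable as unsigned char, so narrowing is safe.
    return static_cast<char>(std::tolower(uc));
}
```

On platforms where plain char is signed, a byte such as 0xE9 is stored as a negative value; the cast maps it into the range 0 to 255 before the call, which is the range std::tolower is defined for (together with EOF).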