How to Set Up VyprVPN on the Raspberry Pi

In this tutorial, I will be going through all the steps to set up VyprVPN on the Raspberry Pi.


This tutorial is handy if you’re looking to connect your Pi to the VyprVPN service.

There are many reasons why you may want to set up a VPN on the Raspberry Pi. The most common is that you want to add an extra layer of security and anonymity to your network activity. These benefits are handy for a range of different Raspberry Pi projects.

Most of our projects have been tested on the latest version of Raspbian. I recommend upgrading to the most recent release for the best experience when following this tutorial.

If VyprVPN doesn’t take your fancy, then we do have other tutorials that cover services such as ExpressVPN or NordVPN.

You can find the tutorial right below. If you have any issues, then be sure to let us know over at our forum.

 Equipment

All the equipment that you need to set up this Raspberry Pi VyprVPN tutorial is listed right below.

Recommended

 Raspberry Pi

 Micro SD Card

 Ethernet Cable or WiFi dongle (Pi 3 has WiFi inbuilt)

 Power Adapter

 VyprVPN Subscription

Optional

 Raspberry Pi Case

 USB Keyboard

 USB Mouse

 Installing VyprVPN to the Raspberry Pi

Installing VyprVPN isn't much different from installing most other VPN services on the Raspberry Pi, as most of them make use of the OpenVPN software.

1. If you haven't already, then you will need to sign up for VyprVPN.

2. Load the terminal on the Raspberry Pi or make use of SSH to access it remotely.

3. Update Raspbian to the latest packages.

sudo apt-get update
sudo apt-get upgrade

4. Now, let's install the OpenVPN package. You can do this by entering the following command.

sudo apt-get install openvpn

5. Change directory to the OpenVPN directory by entering the following.

cd /etc/openvpn/

6. We will now need to download the VyprVPN ovpn files.

sudo wget -O vyprvpn.zip \
https://support.goldenfrog.com/hc/article_attachments/360008728172/GF_OpenVPN_10142016.zip

7. Next, we need to extract the files from the zip archive.

sudo unzip vyprvpn.zip

8. Now let's move all the files to the base directory and delete the OpenVPN256 directory.

sudo mv /etc/openvpn/OpenVPN256/* /etc/openvpn/
sudo rm -r /etc/openvpn/OpenVPN256

9. To connect to VyprVPN, simply use the following command.

sudo openvpn file_name

Replace file_name with the configuration file for the location you wish to connect to. For example, if I wanted Canada, then I would use Canada.ovpn. You can view all the available locations by using the following command.

ls -l /etc/openvpn

Below is an example of connecting to Canada.

sudo openvpn /etc/openvpn/Canada.ovpn

10. It will now ask for your credentials, and you will need to enter them to be able to connect to VyprVPN. Test your connection by going to ipleak.net. You should have a different IP address to your usual one.

11. If you need to disconnect, then you can use either ctrl+c or the following command.

sudo killall openvpn

 Auto Start VyprVPN

Most of us love to reduce the amount of manual input required when it comes to technology. The following steps will show you how to set up VyprVPN to connect automatically on bootup.

1. Firstly, we will need to save both our username and password in a file.

sudo nano /etc/openvpn/auth.txt

2. In this file, add your chosen username and password for the service. Make sure the username and password are both on separate lines.

username
password

3. Save and exit by pressing ctrl+x, then y and lastly enter.

4. Now we will need to copy the ovpn file, simplifying its name at the same time. The new file must use the .conf extension so that OpenVPN can pick it up automatically.

sudo cp "/etc/openvpn/Australia - Sydney.ovpn" /etc/openvpn/aussyd.conf

5. Now let’s edit this new file.

sudo nano /etc/openvpn/aussyd.conf

6. We will only need to do a straightforward edit in this file.

Find

auth-user-pass

Replace with

auth-user-pass auth.txt

7. Finally, we need to set up OpenVPN to auto start using our ovpn file.

sudo nano /etc/default/openvpn

Find

#AUTOSTART="all"

Replace with

AUTOSTART="aussyd"

Replace aussyd with the filename you set.

8. Save and exit.

9. Reboot the Raspberry Pi to test out our new configuration.

sudo reboot

10. Now test the VPN by going to ipleak.net or a similar website. The IP should be VyprVPN's and not your own. Doing this step will confirm that we have successfully set up VyprVPN on the Raspberry Pi.

 Preventing DNS Leaks

To ensure that your DNS isn't leaking your location, you will need to make a small tweak on your Pi. We will simply force our DNS to run through Cloudflare's public DNS rather than our internet service provider's (ISP) DNS. This process is pretty easy and won't take long to do.

1. Firstly, load into the dhcpcd configuration file and update the following line.

Open

sudo nano /etc/dhcpcd.conf

Find

#static domain_name_servers=192.168.0.1

Replace with

static domain_name_servers=1.1.1.1

2. Save & exit the file.

3. Now reboot your Pi by entering the following command.

sudo reboot

4. Go to ipleak.net and check that your DNS is no longer leaking. If you're still leaking, then you might want to look at this page on WebRTC requests for more information.

 Troubleshooting

If you run into trouble while setting up VyprVPN on the Raspberry Pi, then the following troubleshooting tips might help you out.

  • You're able to start and stop your VPN by using the following command. Replacing stop with start will start the VPN back up. This command will only work if you have it set up for autostart.
sudo systemctl stop openvpn
  • It's important to be aware that we are storing credentials in plain text. This lack of security makes it essential that you keep your Pi secure against unauthorized access. Just changing the default password will significantly improve your security.

As I mentioned above, there are plenty of other projects that work great with a VPN. Something as simple as a Torrentbox will benefit. Just make sure your VPN provider allows torrenting, as some will ban you for using up too much bandwidth.

Hopefully, by the end of this Raspberry Pi VyprVPN tutorial, you have everything set up and working as it should be. If you require further help, then I highly recommend that you leave a comment.

One Program Written in Python, Go, and Rust

Python, Go, Rust mascots

Update (2019-07-04): Some kind folks have suggested changes on the implementations to make them more idiomatic, so the code here may differ from what’s currently in the repos.


This is a subjective, primarily developer-ergonomics-based comparison of the three languages from the perspective of a Python developer, but you can skip the prose and go to the code samples, the performance comparison if you want some hard numbers, the takeaway for the tl;dr, or the Python, Go, and Rust diffimg implementations.

A few years ago, I was tasked with rewriting an image processing service. To tell whether my new service was creating the same output as the old one, given an image and one or more transforms (resize, make a circular crop, change formats, etc.), I had to inspect the images myself. Clearly I needed to automate this, but I could find no existing Python library that simply told me how different two images were on a per-pixel basis. Hence diffimg, which can give you a difference ratio/percentage, or generate a diff image (check out the readme to see an example).

The initial implementation was in Python (the language I'm most comfortable in), with the heavy lifting done by Pillow. It's usable as a library or a command line tool. The actual meat of the program is very small, only a few dozen lines, thanks to Pillow. Not a lot of effort went into building this tool (xkcd was right, there's a Python module for nearly everything), but it's at least been useful for a few dozen people other than myself.

A few months ago, I joined a company that had several services written in Go, and I needed to get up to speed quickly on the language. Writing diffimg-go seemed like a fun and possibly even useful way to do this. Here are a few points of interest that came out of the experience, along with some that came up while using it at work:

Comparing Python and Go

(Again, the code: diffimg (python) and diffimg-go)

  • Standard Library: Go comes with a decent image standard library module, as well as a command line flag parsing library. I didn't need to look for any external dependencies; the diffimg-go implementation has none, whereas the Python implementation uses the fairly heavy third-party module (ironically) named Pillow. Go's standard library in general is more structured and well thought out, while Python's is organically evolved, created by many authors over years, with many differing conventions. The Go standard library's consistency makes it easier to predict how any given module will function, and the source code is extremely well documented.
    • One downside of using the standard image library is that it does not automatically detect if the image has an alpha channel; pixel values have four channels (RGBA) for all image types. The diffimg-go implementation therefore requires the user to indicate whether or not they want to use the alpha channel. This small inconvenience wasn’t worth finding a third party library to fix.
    • One big upside is that there’s enough in the standard library that you don’t need a web framework like Django. It’s possible to build a real, usable web service in Go without any dependencies. Python’s claim is that it’s batteries-included, but Go does it better, in my opinion.
  • Static Type System: I've used statically typed languages in the past, but my programming for the past few years has mostly been in Python. The experience was somewhat annoying at first; it felt as though it was simply slowing me down and forcing me to be excessively explicit, whereas Python would just let me do what I wanted, even if I got it wrong occasionally. Somewhat like giving instructions to someone who always stops you to ask you to clarify what you mean, versus someone who always nods along and seems to understand you, though you're not always sure they're absorbing everything. It will decrease the amount of type-related bugs for free, but I've found that I still need to spend nearly the same amount of time writing tests.
    • One of the common complaints of Go is that it does not have user-implementable generic types. While this is not a must-have feature for building a large, extensible application, it certainly slows development speed. Alternative patterns have been suggested, but none of them are as effective as having real generic types.
    • One upside of the static type system is that reading through an unfamiliar codebase is easier and faster. Good use of types imbues a lot of extra information that is lost with a dynamic type system.
  • Interfaces and Structs: Go uses interfaces and structs where Python would use classes. This was probably the most interesting difference to me, as it forced me to differentiate the concept of a type that defines behavior versus a type that holds information. Python and other “traditionally object-oriented” languages would encourage you to mash these together, but there are pros and cons to both paradigms:
    • Go heavily encourages composition over inheritance. While it has inheritance via embedding, without classes, it’s not as easy to forward both data and methods. I generally agree that composition is the better default pattern to reach for, but I’m not an absolutist and some situations are a better fit for inheritance, so I’d prefer not to have the language make this decision for me.
    • Divorcing implementations for interfaces means you need to write similar code several times if you have many types that are similar to each other. Because of the lack of generic types, there are situations in Go where I wouldn’t be able to reuse code, though I would in Python.
    • However, because Go is statically typed, the compiler/linter will tell you when you’re writing code that would have caused a runtime error in Python when you try to access a method or attribute that may not exist. Python linters can get a bit of this functionality, but because of the language’s dynamicity, the linter can’t know exactly what methods/attributes will exist until runtime. Statically defined interfaces and structs are the only way to know what’s available at compile time and during development, making Go that compiles more trustworthy than Python that runs.
  • No Optional Arguments: Go only has variadic functions, which are similar to Python's keyword arguments, but less useful, since the arguments need to be of the same type. I found keyword arguments to be something I really missed, mainly for how much easier refactoring is if you can just throw a kwarg of any type onto whatever function needs it without having to rewrite every one of its calls. I use this feature quite often at work; it's saved me a lot of time over the years. Not having the feature made my implementation for how to handle whether or not the diff image should be created based on the command line flags somewhat clumsy (see the sketch after this list).
  • Verbosity: Go is a bit more verbose (though not Java verbose). Part of that is because the type system does not have generics, but it's mainly the fact that the language itself is very small and not heavily loaded with features (you only get one looping construct!). I missed having Python's list comprehensions and other functional programming features. If you're comfortable with Python, you can go through the Tour of Go in a day or two, and you'll have been exposed to the entirety of the language.
  • Error Handling: Python has exceptions, whereas Go propagates errors by returning a pair of values: value, error from any function where something may go wrong. Python lets you catch errors at any point in the call stack as opposed to requiring you to manually pass them back up over and over again. This again results in brevity and code that isn't littered with Go's infamous if err != nil pattern, though you do need to be aware of what possible exceptions can be thrown by a function and all(!) of its internal calls (using except Exception: is a usually-bad-practice workaround for this). Good docstrings and tests can help here, which you should be writing in either language. Go's system is definitely safer. You're still allowed to shoot yourself in the foot by ignoring the err value, but the system makes it obvious that this is a bad idea.
  • Third Party Modules: Prior to Go modules, Go's package manager would just throw all downloaded packages into $GOPATH/src instead of the project's directory (like most other languages). The path for these modules inside $GOPATH would also be built from the URL where the package is hosted, so your import would look something like import "github.com/someuser/somepackage". Embedding github.com inside the source code of almost all Go codebases seems like a strange choice. In any case, Go now allows the conventional way of doing things, but Go modules are still new, so this quirk will remain common in wild Go code for some time.
  • Asynchronicity: Goroutines are a very convenient way to fire off asynchronous tasks. Before async/await, Python’s asynchronous solutions were somewhat hairy. Unfortunately I haven’t written much real-world async code in Python or Go, and the simplicity of diffimg didn’t seem to lend itself to the added overhead of asynchronicity, so I don’t have too much to say here, though I do like Go’s channels as a way to handle multiple async tasks. My understanding is that for performance, Go still has the upper hand here as goroutines can make use of full multiprocessor parallelism, where Python’s basic async/await is still stuck on one processor, so mainly useful for I/O bound tasks.
  • Debugging: Python wins. pdb (and more sophisticated options like ipdb) is extremely flexible; once you've entered the REPL, you're able to write whatever code you want. Delve is a good debugger, but it's not the same as dropping straight into an interpreter, with the full power of the language at your fingertips.
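For illustration, here is a minimal Python sketch of the keyword-argument point above (hypothetical names, not the real diffimg API): a new kwarg with a default can be added later without rewriting any existing call.

def diff_ratio(im1, im2, ignore_alpha=False, diff_image_path=None):
    # Hypothetical function: the diff_image_path kwarg was added after the
    # fact, and every pre-existing call site keeps working unchanged.
    ratio = 0.0  # the real calculation would happen here
    if diff_image_path is not None:
        pass  # optionally write a diff image here
    return ratio

diff_ratio("a.png", "b.png")                              # old call, still fine
diff_ratio("a.png", "b.png", diff_image_path="diff.png")  # new behavior, opt-in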

Go Summary

My initial impression of Go is that because its ability to abstract is (purposely) limited, it’s not as fun a language as Python is. Python has more features and thus more ways of doing something, and it can be a lot of fun to find the fastest, most readable, or “cleverest” solution. Go actively tries to stop you from being “clever.” I would go as far as saying that Go’s strength is that it’s not clever.

Its minimalism and lack of freedom are constraining as a single developer just trying to materialize an idea. However, this weakness becomes its strength when the project scales to dozens or hundreds of developers – because everyone’s working with the same small toolset of language features, it’s more likely to be uniform and thus understandable by others. It’s still very possible to write bad Go, but it’s more difficult to create monstrosities that more “powerful” languages will let you produce.

After using it for a while, it makes sense to me why a company like Google would want a language like this. New engineers are being introduced to enormous codebases constantly, and in a messier/more powerful language and under the pressure of deadlines, complexity could be introduced faster than it can be removed. The best way to prevent that is with a language that has less capacity for it.

With that said, I’m happy to work on a Go codebase in the context of a large application with a diverse and ever-growing team. In fact, I think I’d prefer it. I just have no desire to use it for my own personal projects.

Enter Rust

A few weeks ago, I decided to give an honest go at learning Rust. I had attempted to do so before, but found the type system and borrow checker confusing and, without enough context for why all these constraints were being forced on me, cumbersome for the tasks I was trying to do. However, since then, I've learned a bit more about what happens with memory during the execution of a program. I also started with the book instead of just attempting to dive in headfirst. This was massively helpful, and probably the best introduction to any programming language I've ever experienced.

After I had gone through the first dozen or so chapters of the book, I felt confident enough to try another implementation of diffimg (at this point, I had about as much experience with Rust as I’d had with Go when I wrote diffimg-go). It took me a bit longer to write than the Go implementation, which itself took longer than Python. I think this would be true even taking into account my greater comfort with Python – there’s just more to write in both languages.

Some of the things that I took notice of when writing diffimg-rs:

  • Type System: I was comfortable with the more basic static type system of Go by now, but Rust's is significantly more powerful (and complicated). Generic types, enumerated types, traits, reference types, and lifetimes are all additional concepts that I had to learn on top of Go's much simpler interfaces and structs. Additionally, Rust uses its type system to implement features that other languages don't use the type system for (example: the Result type, which I'll talk about soon). Luckily, the compiler/linter is extremely helpful in telling you what you're doing wrong, and often even tells you exactly how to fix it. Despite this, I've spent significantly more time learning Rust's type system than I did Go's, and I'm still not comfortable with all the features yet.
    • There was one place where, because of the type system, the implementation of the imaging library I was using would have led to an uncomfortable amount of code repetition. I only ended up matching the two most important enum types, but matching the others would lead to another half dozen or so lines of nearly identical code. At this scale it's not an issue, but it rubs me the wrong way. Maybe it's a good candidate for using macros, which I still need to experiment with.
      let mut diff = match image1.color() {
          image::ColorType::RGB(_) => image::DynamicImage::new_rgb8(w, h),
          image::ColorType::RGBA(_) => image::DynamicImage::new_rgba8(w, h),
          // keep going for all 7 types?
          _ => return Err(
              format!("color mode {:?} not yet supported", image1.color())
          ),
      };
      
  • Manual Memory Management: Python and Go pick up your trash for you. C lets you litter everywhere, but throws a fit when it steps on your banana peel. Rust slaps you and demands that you clean up after yourself. This stung at first, since I’m spoiled and usually have my languages pick up after me, moreso even than moving from a dynamic to a statically typed language. Again, the compiler tries to help you as much as is possible, but there’s still a good amount of studying you’ll need to do to understand what’s really going on.
    • One nice part about having such direct access to the memory (and the functional programming features of Rust) is that it simplified the difference ratio calculation, because I could simply map over the raw byte arrays instead of having to index each pixel by coordinate.
  • Functional Features: Rust strongly encourages a functional approach: it has an FP-friendly type system like Haskell, immutable types, closures, iterators, pattern matching, and more, but it also allows imperative code. It's similar to writing OCaml (interestingly, the original Rust compiler was written in OCaml). Because of this, code is more concise than you'd expect for a language that competes with C.
  • Error Handling: Instead of the exception model that Python uses or the multiple return values that Go uses for error handling, Rust makes use of its enumerated types: Result returns either Ok(value) or Err(error). This is closer to Go's way if you squint, but is a bit more explicit and leverages the type system. There's also syntactic sugar for checking a statement for an Err and returning early: the ? operator (Go could use something like this, IMO).
  • Asynchronicity: Async/await hasn't quite landed for Rust yet, but the final syntax has recently been agreed upon. Rust also has some basic threading features in the standard library that seem a bit easier to use than Python's, but I haven't spent much time with them. Go still seems to have the best offerings here.
  • Tooling: rustup and cargo are extremely polished implementations of a language version manager and package/module manager, respectively. Everything "just works." I especially love the autogenerated docs. The Python options for these are somewhat organic and finicky, and as I mentioned before, Go has a strange way of managing modules, though aside from that, its tooling is in a much better state than Python's.
  • Editor Plugins: My .vimrc is embarrassingly large, with at least three dozen plugins. I have some plugins for linting, autocompleting, and formatting both Python and Go, but the Rust plugins were easier to set up, more helpful, and more consistent compared to the other two languages. The rust.vim and vim-lsp plugins (along with the Rust Language Server) were all I needed to get an extremely powerful configuration. I haven’t tested out other editors with Rust but with the excellent editor-agnostic tooling that Rust comes with, I’d expect them to be just as helpful. The setup provides the best go-to-definition I’ve ever used. It works perfectly on local, standard library, and third-party code out of the box.
  • Debugging: I haven't tried out a debugger with Rust yet (since the type system and println! take you pretty far), but you can use rust-gdb and rust-lldb, wrappers around the gdb and lldb debuggers that are installed with the initial rustup. The experience should be predictable if you've used those debuggers before with C. As mentioned previously, the compiler error messages are extremely helpful.

Rust Summary

I definitely wouldn’t recommend attempting to write Rust without at least going through the first few chapters of the book, even if you’re already familiar with C and memory management. With Go and Python, as long as you have some experience with another modern imperative programming language, they’re not difficult to just start writing, referring to the docs when necessary. Rust is a large language. Python also has a lot of features, but they’re mostly opt-in. You can get a lot done just by understanding a few primitive data structures and some builtin functions. With Rust, you really need to understand the complexity inherent to the type system and borrow checker, or you’re going to be getting tangled up a lot.

As far as how I feel when I write Rust, it’s a lot of fun, like Python. Its breadth of features makes it very expressive. While the compiler stops you a lot, it’s also very helpful, and its suggestions on how to solve your borrowing/typing problems usually work. The tooling as I’ve mentioned is the best I’ve encountered for any language and doesn’t bring me a lot of headaches like some other languages I’ve used. I really like using the language and will continue to look for opportunities to do so, where the performance of Python isn’t good enough.

Code Samples

I've extracted the chunks of each diffimg which calculate the difference ratio. To summarize how it works for Python: this takes the diff image generated by Pillow, sums the values of all channels of all pixels, and returns the ratio produced by dividing this sum by the maximum possible value (that of a pure white image of the same size).

Python:


diff_img = ImageChops.difference(im1, im2)
stat = ImageStat.Stat(diff_img)
sum_channel_values = sum(stat.mean)
max_all_channels = len(stat.mean) * 255
diff_ratio = sum_channel_values / max_all_channels

For Go and Rust, the method is a little different: Instead of creating a diff image, we just loop over both input images and keep a running sum of the differences of each pixel. In Go, we’re indexing into each image by coordinate…

Go:


func GetRatio(im1, im2 image.Image, ignoreAlpha bool) float64 {
  var sum uint64
  width, height := getWidthAndHeight(im1)
  for y := 0; y < height; y++ {
    for x := 0; x < width; x++ {
      sum += uint64(sumPixelDiff(im1, im2, x, y, ignoreAlpha))
    }
  }
  var numChannels = 4
  if ignoreAlpha {
    numChannels = 3
  }
  totalPixVals := (height * width) * (maxChannelVal * numChannels)
  return float64(sum) / float64(totalPixVals)
}

… but in Rust, we’re treating the images as what they really are in memory, a series of bytes that we can just zip together and consume.

Rust:


pub fn calculate_diff(
    image1: DynamicImage,
    image2: DynamicImage
  ) -> f64 {
  let max_val = u64::pow(2, 8) - 1;
  let mut diffsum: u64 = 0;
  for (&p1, &p2) in image1
      .raw_pixels()
      .iter()
      .zip(image2.raw_pixels().iter()) {
    diffsum += u64::from(abs_diff(p1, p2));
  }
  let total_possible = max_val * image1.raw_pixels().len() as u64;
  let ratio = diffsum as f64 / total_possible as f64;

  ratio
}

Some things to take note of in these examples:

  • Python has the least code by far. Obviously, it’s leaning heavily on features of the image library it’s using, but this is indicative of the general experience of using Python. In many cases, a lot of the work has been done for you because the ecosystem is so developed that there are mature pre-existing solutions for everything.
  • There's type conversion in the Go and Rust examples. In each block there are three numerical types being used: uint8/u8 for the pixel channel values (the type is inferred in both Go and Rust, so you don't see any explicit mention of either type), uint64/u64 for the sum, and float64/f64 for the final ratio. For Go and Rust, there was time spent getting the types to line up, whereas Python converts everything implicitly.
  • The Go implementation’s style is very imperative, but also explicit and understandable (minus the ignoreAlpha part I mentioned earlier), even to those unaccustomed to the language. The Python example is fairly clear as well, once you understand what ImageStat is doing. Rust is definitely murkier to those unfamiliar with the language:
    • .raw_pixels() gets the image as a vector of unsigned 8-bit integers.
    • .iter() creates an iterator for that vector. Vectors by default are not iterable.
    • .zip() you may be familiar with, it takes two iterators and produces one, with each element being a tuple: (element from first vector, element from second vector).
    • We need a mut in our diffsum declaration because by default, variables are immutable.
    • If you’re familiar with C you can probably figure out why we have the &s in for (&p1, &p2): The iterator produces references to the pixel values, but abs_diff() takes the values themselves. Go supports pointers (which are not quite the same as references), but they’re not as commonly used as references are in Rust.
    • The last statement in a function is used as the return value if there isn’t a line-ending ;. A few other functional languages do this as well.

    This snippet gives you some insight into how much language-specific knowledge you’ll need to pick up to be effective in Rust.

Performance

Now for something resembling a scientific comparison. I first generated three random images of different sizes: 1×1, 2000×2000, and 10,000×10,000. Then I measured each (language, image size) combination's performance 10 times for each diffimg ratio calculation and averaged them, using the real values reported by the time command. diffimg-rs was built using --release, diffimg-go with just go build, and the Python diffimg invoked with python3 -m diffimg. The results, on a 2015 Macbook Pro:

Image size:   1×1           2000×2000        10,000×10,000
Rust          0.001s        0.490s           5.871s
Go            0.002s (2x)   0.756s (1.54x)   14.060s (2.39x)
Python        0.095s (95x)  1.419s (2.90x)   28.751s (4.89x)

I’m losing a lot of precision because time only goes down to 10ms resolution (one more digit is shown here because of the averaging). The task only requires a very specific type of calculation as well, so a different or more complex one could have very different numbers. Despite these caveats, we can still learn something from the data.

With the 1×1 image, virtually all the time is spent in setup, not ratio calculation. Rust wins, despite using two third-party libraries (clap and image) and Go only using the standard library. I’m not surprised Python’s startup is as slow as it is, since importing a large library (Pillow) is one of its steps, and even just time python -c '' takes 0.030s.

At 2000×2000, the gap narrows for both Go and Python compared to Rust, presumably because less of the overall time is spent in setup compared to calculation. However, at 10,000×10,000, Rust is more performant in comparison, which I would guess is due to its compiler’s optimizations producing the smallest block of machine code that is looped through 100,000,000 times, dwarfing the setup time. Never needing to pause for garbage collection could also be a factor.

The Python implementation definitely has room for improvement, because as efficient as Pillow is, we’re still creating a diff image in memory (traversing both input images) and then adding up each of its pixel’s channel values. A more direct approach like the Go and Rust implementations would probably be marginally faster. However, a pure Python implementation would be wildly slower, since Pillow does its main work in C. Because the other two are pure language implementations, this isn’t really a fair comparison, though in some ways it is, because Python has an absurd amount of libraries available to you that are performant thanks to C extensions (and Python and C have a very tight relationship in general).

I should also mention the binary sizes: Rust's is 2.1mb with the --release build, and Go's is comparable at 2.5mb. Python doesn't create binaries, but .pyc files are sort of comparable, and diffimg's .pyc files are about 3kb in total. Its source code is also only about 3kb, but including the Pillow dependency, it weighs in at 24mb(!). Again, not a fair comparison because I'm using a third party imaging library, but it should be mentioned.

The Takeaway

Obviously, these are three very different languages fulfilling different niches. I’ve heard Go and Rust often mentioned together, but I think Go and Python are the two more similar/competing languages. They’re both good for writing server-side application logic (what I spend most of my time doing at work). Comparing just native code performance, Go blows Python away, but many of Python’s libraries that require speed are wrappers around fast C implementations – in practice, it’s more complicated than a naive comparison. Writing a C extension for Python doesn’t really count as Python anymore (and then you’ll need to know C), but the option is open to you.

For your backend server needs, Python has proven itself to be “fast enough” for most applications, though if you need more performance, Go has it. Rust even more so, but you pay for it with development time. Go is not far off from Python in this regard, though it certainly is slower to develop, primarily due to its small feature set. Rust is very fully featured, but managing memory will always take more time than having the language do it, and this outweighs having to deal with Go’s minimality.

It should also be mentioned that there are many, many Python developers in the world, some with literally decades of experience. It will likely never be hard to find more people with language experience to add to your backend team if you choose Python. However, Go developers are not particularly rare, and can easily be created because the language is so easy to learn. Rust developers are both rarer and harder to make since the language takes longer to internalize.

With respect to the type systems: static type systems make it easier to write more correct code, but they're not a panacea. You still need to write comprehensive tests no matter the language you use. It requires a bit more discipline, but I've found that the code I write in Python is not necessarily more error-prone than Go as long as I'm able to write a good suite of tests. That said, I much prefer Rust's type system to Go's: it supports generics and pattern matching, handles errors through the type system, and just does more for you in general.

In the end, this comparison is a bit silly, because though the use cases of these languages overlap, they occupy very different niches. Python is high on the development-speed, low on the performance scale, while Rust is the opposite, and Go is in the middle. I enjoy writing Python and Rust more than Go (this may be unsurprising), though I’ll continue to use Go at work happily (along with Python) since it really is a great language for building stable and maintainable applications with many contributors from many backgrounds. Its inflexibility and minimalism which makes it less enjoyable to use (for me) becomes its strength here. If I had to choose the language for the backend of a new web application, it would be Go.

I’m pretty satisfied with the range of programming tasks that are covered by these three languages – there’s virtually no project that one of them wouldn’t be a great choice for.

Top 10 Python Libraries You Must Know in 2019

In this article, we will discuss some of the top libraries in Python that developers can use to parse, clean, and represent data and implement machine learning in their existing applications.

We will be considering the following 10 libraries:

  • TensorFlow
  • Scikit-Learn
  • Numpy
  • Keras
  • PyTorch
  • LightGBM
  • Eli5
  • SciPy
  • Theano
  • Pandas


Introduction

Python is one of the most popular and widely used programming languages and has replaced many programming languages in the industry.

There are many reasons why Python is popular among developers. However, one of the most significant is its large collection of libraries that users can work with.

The simplicity of Python has attracted many developers to create new libraries for machine learning. Because of the huge collection of libraries, Python is becoming hugely popular among machine learning experts.

So, the first library is TensorFlow.

TensorFlow


What Is TensorFlow?

If you are currently working on a machine learning project in Python, then you may have heard about this popular open-source library known as TensorFlow.

This library was developed by Google in collaboration with the Brain Team. TensorFlow is used in almost every Google application for machine learning.

TensorFlow works like a computational library for writing new algorithms that involve a large number of tensor operations. Since neural networks can easily be expressed as computational graphs, they can be implemented using TensorFlow as a series of operations on tensors. Plus, tensors are N-dimensional arrays that represent your data.
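As a minimal sketch (assuming a TensorFlow 2.x installation), here is the kind of tensor operation those graphs are built from:

import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # a 2x2 tensor
w = tf.constant([[0.5], [0.5]])            # a 2x1 tensor
y = tf.matmul(x, w)                        # one operation in the graph
print(y.numpy())                           # [[1.5] [3.5]]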

Features of TensorFlow

TensorFlow is optimized for speed, and it makes use of techniques like XLA for quick linear algebra operations.

1. Responsive Construct

With TensorFlow, we can easily visualize each and every part of the graph, which is not an option while using Numpy or Scikit-learn.

2. Flexible

One of the most important TensorFlow features is that it is flexible in its operability: the library is modular, and for the parts of it that you want to use stand-alone, it offers you that option.

3. Easily Trainable

It is easily trainable on CPU as well as GPU, including for distributed computing.

4. Parallel Neural Network Training

TensorFlow offers pipelining, in the sense that you can train multiple neural networks on multiple GPUs, which makes the models very efficient on large-scale systems.

5. Large Community

Needless to say, if it has been developed by Google, there is already a large team of software engineers who work on stability improvements continuously.

6. Open Source

The best thing about this machine learning library is that it is open source, so anyone can use it as long as they have internet connectivity.

Where Is TensorFlow Used?

You are using TensorFlow daily but indirectly with applications like Google Voice Search or Google Photos. These applications are developed using this library.

All the libraries created in TensorFlow are written in C and C++. However, it has a complicated frontend for Python. Your Python code gets compiled and then executed on the TensorFlow distributed execution engine, which is built using C and C++.

The number of applications of TensorFlow is literally unlimited, and that is the beauty of TensorFlow.

Scikit-Learn


What Is Scikit-learn?

Scikit-learn is a Python library associated with NumPy and SciPy. It is considered one of the best libraries for working with complex data.

There are a lot of changes being made in this library. One modification is the cross-validation feature, providing the ability to use more than one metric. Lots of training methods, like logistic regression and nearest neighbors, have received small improvements.

Features Of Scikit-Learn

1. Cross-validation: There are various methods to check the accuracy of supervised models on unseen data (see the sketch after this list).

2. Unsupervised learning algorithms: Again, there is a large spread of algorithms in the offering — starting from clustering, factor analysis, and principal component analysis to unsupervised neural networks.

3. Feature extraction: Useful for extracting features from images and text (e.g., bag of words).
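Here is a small sketch of the cross-validation feature mentioned above (assuming scikit-learn is installed), scoring a model with more than one metric at once:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = load_iris(return_X_y=True)
scores = cross_validate(
    LogisticRegression(max_iter=1000),
    X, y, cv=5,
    scoring=["accuracy", "f1_macro"],  # more than one metric at once
)
print(scores["test_accuracy"].mean(), scores["test_f1_macro"].mean())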

Where Is Scikit-Learn Used?

It contains numerous algorithms for implementing standard machine learning and data mining tasks, like reducing dimensionality, classification, regression, clustering, and model selection.

Numpy


What Is Numpy?

Numpy is considered one of the most popular machine learning libraries in Python.

TensorFlow and other libraries use Numpy internally for performing multiple operations on tensors. The array interface is the best and most important feature of Numpy.

Features Of Numpy

  1. Interactive: Numpy is very interactive and easy to use
  2. Mathematics: Makes complex mathematical implementations very simple
  3. Intuitive: Makes coding really easy, and the concepts are easy to grasp
  4. Lots of interaction: Widely used, hence a lot of open-source contributions

Where Is Numpy Used?

This interface can be utilized for expressing images, sound waves, and other binary raw streams as N-dimensional arrays of real numbers.

Knowledge of Numpy is also important for full-stack developers who want to implement machine learning with this library.
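As a short sketch of that idea (assuming NumPy is installed), an image is just an N-dimensional array you can operate on directly:

import numpy as np

image = np.zeros((64, 64, 3), dtype=np.uint8)  # a 64x64 RGB "image"
image[:, :, 0] = 255                           # set every pixel's red channel
print(image.shape)                             # (64, 64, 3)
print(image.mean())                            # 85.0 -- one call over all pixels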

Keras


What Is Keras?

Keras is considered one of the coolest machine learning libraries in Python. It provides an easier mechanism to express neural networks. Keras also provides some of the best utilities for compiling models, processing datasets, visualizing graphs, and much more.

In the backend, Keras uses either Theano or TensorFlow internally. Other popular backends, such as CNTK, can also be used. Keras is comparatively slow when we compare it with other machine learning libraries, because it creates a computational graph using the backend infrastructure and then makes use of it to perform operations. All the models in Keras are portable.

Features Of Keras

  • It runs smoothly on both CPU and GPU.
  • Keras supports almost all the models of a neural network — fully connected, convolutional, pooling, recurrent, embedding, etc. Furthermore, these models can be combined to build more complex models.
  • Keras, being modular in nature, is incredibly expressive, flexible, and apt for innovative research.
  • Keras is a completely Python-based framework, which makes it easy to debug and explore.

Where Is Keras Used?

You are already constantly interacting with features built with Keras — it is in use at Netflix, Uber, Yelp, Instacart, Zocdoc, Square, and many others. It is especially popular among startups that place deep learning at the core of their products.

Keras contains numerous implementations of commonly used neural network building blocks such as layers, objectives, activation functions, optimizers and a host of tools to make working with image and text data easier.
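For example, here is a minimal sketch of stacking those building blocks (assuming the standalone keras package with one of the backends mentioned above):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(32, activation="relu", input_shape=(784,)),  # fully connected layer
    Dense(10, activation="softmax"),                   # 10-class output layer
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",         # the objective
              metrics=["accuracy"])
model.summary()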

Plus, it provides many pre-processed data-sets and pre-trained models like MNIST, VGG, Inception, SqueezeNet, ResNet, etc.

Keras is also a favorite among deep learning researchers, coming in at #2. Keras has also been adopted by researchers at large scientific organizations, in particular, CERN and NASA.

PyTorch


What Is PyTorch?

PyTorch is one of the largest machine learning libraries; it allows developers to perform tensor computations with GPU acceleration, create dynamic computational graphs, and calculate gradients automatically. Beyond this, PyTorch offers rich APIs for solving application issues related to neural networks.
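A minimal sketch of those capabilities together (assuming the torch package is installed):

import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)  # a tensor (GPU-capable via .cuda())
y = (x ** 2).sum()   # the computational graph is built dynamically as we compute
y.backward()         # gradients are calculated automatically
print(x.grad)        # tensor([4., 6.]), i.e. dy/dx = 2x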

This machine learning library is based on Torch, which is an open-source machine library implemented in C with a wrapper in Lua.

This Python machine learning library was introduced in 2017, and since its inception, it has been gaining popularity and attracting an increasing number of machine learning developers.

Features Of PyTorch

Hybrid Front-End

A new hybrid frontend provides ease-of-use and flexibility in eager mode, while seamlessly transitioning to graph mode for speed, optimization, and functionality in C++ runtime environments.

Distributed Training

Optimize performance in both research and production by taking advantage of native support for asynchronous execution of collective operations and peer-to-peer communication that is accessible from Python and C++.

Python First

PyTorch is not a Python binding into a monolithic C++ framework. It’s built to be deeply integrated into Python so it can be used with popular libraries and packages such as Cython and Numba.

Libraries and Tools

An active community of researchers and developers have built a rich ecosystem of tools and libraries for extending PyTorch and supporting development in areas from computer vision to reinforcement learning.

Where Is PyTorch Used?

PyTorch is primarily used for applications such as natural language processing.

It is primarily developed by Facebook's artificial-intelligence research group, and Uber's "Pyro" software for probabilistic programming is built on it.

PyTorch outperforms TensorFlow in several ways, and it has been gaining a lot of attention recently.

LightGBM


What Is LightGBM?

Gradient boosting is one of the best and most popular machine learning (ML) techniques; it helps developers build new algorithms by combining refined elementary models, namely decision trees. Therefore, there are special libraries designed for fast and efficient implementations of this method.

These libraries are LightGBM, XGBoost, and CatBoost. All of them are competitors that help solve a common problem and can be used in almost the same manner.

Features of LightGBM

Very fast computation ensures high production efficiency.

Intuitive, which makes it user-friendly.

Faster training than many other gradient boosting libraries.

Will not produce errors when you pass it NaN values and other canonical values.

Where Is LightGBM Used?

This library provides highly scalable, optimized, and fast implementations of gradient boosting, which makes it popular among machine learning developers. In fact, many winning entries in machine learning competitions have used these algorithms.
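As a small sketch (assuming the lightgbm and scikit-learn packages are installed), LightGBM's scikit-learn-style interface looks like this:

import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = lgb.LGBMClassifier(n_estimators=100)  # an ensemble of boosted decision trees
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))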

Eli5


What Is Eli5?

Most often, the predictions of machine learning models are not easy to interpret, and the Eli5 machine learning library, built in Python, helps overcome this challenge. It combines visualization and debugging of machine learning models and tracks all the working steps of an algorithm.

Features of Eli5

Moreover, Eli5 supports other libraries, including XGBoost, lightning, scikit-learn, and sklearn-crfsuite. Each of these libraries can be used to perform different tasks.
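For instance, here is a minimal sketch (assuming the eli5 and scikit-learn packages are installed) of inspecting what a fitted model has learned:

import eli5
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

explanation = eli5.explain_weights(clf)   # the model's weights, per class
print(eli5.format_as_text(explanation))   # plain-text rendering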

Where Is Eli5 Used?

  • Mathematical applications that require a lot of computation in a short time.
  • Eli5 plays a vital role where there are dependencies with other Python packages.
  • Legacy applications and implementing newer methodologies in various fields.

SciPy


What Is SciPy?

SciPy is a machine learning library for application developers and engineers. However, you still need to know the difference between the SciPy library and the SciPy stack. The SciPy library contains modules for optimization, linear algebra, integration, and statistics.

Features Of SciPy

The main feature of the SciPy library is that it is developed using NumPy and makes heavy use of NumPy arrays.

In addition, SciPy provides all the efficient numerical routines like optimization, numerical integration, and many others using its specific submodules.

All the functions in all submodules of SciPy are well documented.

Where Is SciPy Used?

SciPy is a library that uses NumPy for the purpose of solving mathematical functions. SciPy uses NumPy arrays as the basic data structure and comes with modules for various commonly used tasks in scientific programming.

Tasks including linear algebra, integration (calculus), ordinary differential equation solving and signal processing are handled easily by SciPy.
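A short sketch of a couple of those routines (assuming scipy is installed):

from scipy import integrate, optimize

area, error = integrate.quad(lambda x: x ** 2, 0, 1)  # numerical integration
print(area)                                           # ~0.3333

result = optimize.minimize_scalar(lambda x: (x - 2) ** 2)  # optimization
print(result.x)                                            # ~2.0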

Theano


What Is Theano?

Theano is a computational framework machine learning library in Python for computing multidimensional arrays. Theano works similarly to TensorFlow, but it is not as efficient as TensorFlow, mainly because of its inability to fit into production environments.

Moreover, Theano can also be used in distributed or parallel environments, much like TensorFlow.

Features Of Theano

  • Tight integration with NumPy – the ability to use NumPy arrays fully in Theano-compiled functions.
  • Transparent use of a GPU – Perform data-intensive computations much faster than on a CPU.
  • Efficient symbolic differentiation – Theano does your derivatives for functions with one or many inputs.
  • Speed and stability optimizations – Get the right answer for log(1+x) even when x is very tiny. This is just one of the examples to show the stability of Theano.
  • Dynamic C code generation – Evaluate expressions faster than ever before, thereby increasing efficiency by a lot.
  • Extensive unit-testing and self-verification – Detect and diagnose multiple types of errors and ambiguities in the model.

Where Is Theano Used?

The actual syntax of Theano expressions is symbolic, which can be off-putting to beginners used to normal software development. Specifically, an expression is defined in the abstract sense, compiled, and later actually used to make calculations.
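A minimal sketch of that define-compile-run workflow (assuming the theano package is installed):

import theano
import theano.tensor as T

x = T.dscalar("x")           # define a symbolic scalar, in the abstract sense
y = x ** 2 + 1               # build a symbolic expression from it
f = theano.function([x], y)  # compile the expression
print(f(3.0))                # 10.0 -- only now is anything actually computed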

It was specifically designed to handle the types of computation required for large neural network algorithms used in Deep Learning. It was one of the first libraries of its kind (development started in 2007) and is considered an industry standard for Deep Learning research and development.

Theano is being used in multiple neural network projects today, and the popularity of Theano is only growing with time.

Pandas


What Is Pandas?

Pandas is a machine learning library in Python that provides high-level data structures and a wide variety of tools for analysis. One of the great features of this library is the ability to translate complex operations with data using one or two commands. Pandas has many inbuilt methods for grouping, combining, and filtering data, as well as time-series functionality.

All of this comes with outstanding speed.

Features Of Pandas

Pandas makes the entire process of manipulating data easier. Support for operations such as re-indexing, iteration, sorting, aggregation, concatenation, and visualization is among the feature highlights of Pandas.
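As a small sketch (assuming pandas is installed) of translating a complex operation into one or two commands:

import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Lima", "Lima"],
    "temp": [2, 4, 25, 27],
})
# group, aggregate, and sort in a single chained expression
print(df.groupby("city")["temp"].mean().sort_values())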

Where Is Pandas Used?

Pandas currently sees relatively few releases, but each one includes hundreds of new features, bug fixes, enhancements, and API changes. The improvements in Pandas lie in its ability to group and sort data, select the best-suited output for the applied method, and provide support for performing custom-type operations.

Data analysis, above everything else, is where Pandas shines. But when used with other libraries and tools, Pandas ensures high functionality and a good amount of flexibility.

That's it, folks! I hope this article helped you kickstart your learning of the libraries available in Python.

18.4 An Introduction to systemd-journald.service

Back in the days when there was only rsyslogd, logging could begin only after the system had finished booting and the rsyslogd daemon had started. The kernel therefore had to provide its own klogd service to capture the messages produced during the boot process and while services were starting, and then hand them over to rsyslogd for processing once it was running.

Now that we have systemd, things are different: systemd is woken by the kernel itself and is the very first piece of software to run, so it can call on systemd-journald directly to help record log data. As a result, all the messages produced during boot, including services starting up and services failing to start, can be recorded straight into systemd-journald!

However, because systemd-journald keeps its log records in memory, the log messages from before a reboot will naturally no longer be recorded after the system restarts. For this reason, we still recommend enabling rsyslogd to help classify and record logs. In other words, systemd-journald manages and queries the log messages of the current boot, while rsyslogd records data from both past and present boots to files on disk, making future queries convenient!

 

Tips: Although the data recorded by systemd-journald actually lives in memory, the system still stores it in file form under /run/log/. As we learned in earlier chapters, on CentOS 7 /run is itself held in memory, so after a reboot, the data under /run/log is of course flushed, and the old records no longer exist!

18.4.1 Using journalctl to Inspect Log Messages

So how do we call up the data recorded by systemd-journald.service for review? Very simple: just use journalctl! Let's take a look at what this command can do.

[root@study ~]# journalctl [-nrpf] [--since TIME] [--until TIME] _optional
Options and parameters:
By default the full log is shown, from the oldest entry to the newest message
-n  : show only the most recent n lines; very useful for finding the latest messages
-r  : reverse the output, from the newest entry back to the oldest
-p  : show only messages of the given priority; see the rsyslogd information in the previous section
-f  : similar to tail -f, keep displaying journal entries as they arrive (very helpful for real-time monitoring!)
--since --until: set the start and end times, outputting only the data within that window
_SYSTEMD_UNIT=unit.service : output only the messages of unit.service
_COMM=bash : output only the messages related to bash
_PID=pid   : output only the messages of the given PID number
_UID=uid   : output only the messages of the given UID
SYSLOG_FACILITY=[0-23] : select messages by the facility numbers defined in syslog.h

Example 1: show all of the journal log data currently on the system
[root@study ~]# journalctl
-- Logs begin at Mon 2015-08-17 18:37:52 CST, end at Wed 2015-08-19 00:01:01 CST. --
Aug 17 18:37:52 study.centos.vbird systemd-journal[105]: Runtime journal is using 8.0M (max 
 142.4M, leaving 213.6M of free 1.3G, current limit 142.4M).
Aug 17 18:37:52 study.centos.vbird systemd-journal[105]: Runtime journal is using 8.0M (max
 142.4M, leaving 213.6M of free 1.3G, current limit 142.4M).
Aug 17 18:37:52 study.centos.vbird kernel: Initializing cgroup subsys cpuset
Aug 17 18:37:52 study.centos.vbird kernel: Initializing cgroup subsys cpu
.....(output omitted).....
Aug 19 00:01:01 study.centos.vbird run-parts(/etc/cron.hourly)[19268]: finished 0anacron
Aug 19 00:01:01 study.centos.vbird run-parts(/etc/cron.hourly)[19270]: starting 0yum-hourly.cron
Aug 19 00:01:01 study.centos.vbird run-parts(/etc/cron.hourly)[19274]: finished 0yum-hourly.cron
# All the data since this boot is displayed, paged through less for the administrator to review. It is a lot of data!

Example 2: show (1) only the full day of 2015/08/18, (2) only today's, and (3) only yesterday's log data
[root@study ~]# journalctl --since "2015-08-18 00:00:00" --until "2015-08-19 00:00:00"
[root@study ~]# journalctl --since today
[root@study ~]# journalctl --since yesterday --until today

Example 3: find only the data for crond.service, listing just the latest 10 entries
[root@study ~]# journalctl _SYSTEMD_UNIT=crond.service -n 10

Example 4: find the log entries produced by su and login, listing just the latest 10 entries
[root@study ~]# journalctl _COMM=su _COMM=login -n 10

Example 5: find the messages whose severity level is error
[root@study ~]# journalctl -p err

Example 6: find the log messages related to the login services (auth, authpriv)
[root@study ~]# journalctl SYSLOG_FACILITY=4 SYSLOG_FACILITY=10
# For more data on syslog facilities, see section 18.2.1!

Basically, journalctl really can take care of all your log data: everything is in there! Now suppose you want to follow the log as it changes in real time. How do you handle that? Open two terminals and let's work through it!

# In terminal 1, use the following to keep monitoring the system!
[root@study ~]# journalctl -f
# The system will seem to hang. It isn't stuck! Like tail -f, it is continuously displaying log messages!

# In terminal 2, send a quick email to any account on the system!
[root@study ~]# echo "testing" | mail -s 'tset' dmtsai
# You will find that terminal 1 keeps printing messages! That's right! It works!

If there is behavior you need to watch for, you can use this approach to follow the system's messages in real time. To cancel journalctl -f, just press [ctrl]+c!

18.4.2 Applications of the logger Command

Everything above was about calling up the log for review. Looking at it from the other direction: what if you want to put your own data into the log? How do you do that? This is where the handy logger command comes in! It can transmit many kinds of information, but here we will only use the simplest local message delivery. For more usage, please man logger yourself!

[root@study ~]# logger [-p facility.priority] "message"
Options and parameters:
facility.priority : for this item, see the rsyslogd discussion in the later sections of this chapter;

Example 1: have dmtsai use logger to send data into the log
[root@study ~]# logger -p user.info "I will check logger command"
[root@study ~]# journalctl SYSLOG_FACILITY=1 -n 3
-- Logs begin at Mon 2015-08-17 18:37:52 CST, end at Wed 2015-08-19 18:03:17 CST. --
Aug 19 18:01:01 study.centos.vbird run-parts(/etc/cron.hourly)[29710]: starting 0yum-hourly.cron
Aug 19 18:01:01 study.centos.vbird run-parts(/etc/cron.hourly)[29714]: finished 0yum-hourly.cron
Aug 19 18:03:17 study.centos.vbird dmtsai[29753]: I will check logger command

Now let's have a look: with the backup.service unit we wrote earlier, when backing up manually, that is, by running "/backups/backup.sh log", how do we use logger to record the backup's start and finish times? Try this!

[root@study ~]# vim /backups/backup.sh
#!/bin/bash

if [ "${1}" == "log" ]; then
        logger -p syslog.info "backup.sh is starting"
fi
source="/etc /home /root /var/lib /var/spool/{cron,at,mail}"
target="/backups/backup-system-$(date +%Y-%m-%d).tar.gz"
[ ! -d /backups ] && mkdir /backups
tar -zcvf ${target} ${source} &> /backups/backup.log
if [ "${1}" == "log" ]; then
        logger -p syslog.info "backup.sh is finished"
fi

[root@study ~]# /backups/backup.sh log
[root@study ~]# journalctl SYSLOG_FACILITY=5 -n 3
Aug 19 18:09:37 study.centos.vbird dmtsai[29850]: backup.sh is starting
Aug 19 18:09:54 study.centos.vbird dmtsai[29855]: backup.sh is finished

With this little tool, we can put our own data into the log as well!

18.4.3 Persisting the Journal

To stress it once more: the messages held by systemd-journald.service do not carry over to the next boot, so after a reboot, all the earlier records are lost. We usually have rsyslogd running to take care of persistent log files afterwards, but if you prefer journalctl's way of accessing things, you can save this data!

Basically, systemd-journald.service takes its configuration from /etc/systemd/journald.conf; for the details of each parameter, see man 5 journald.conf. By default, the configuration should already meet our needs, so I won't modify the configuration file here. If you do want to keep the log files that journalctl reads, simply create a /var/log/journal directory and sort out its permissions; then, after systemd-journald.service is restarted, the journal log files will automatically be copied into /var/log/journal!

# 1. First, create the required directory and set the relevant permissions
[root@study ~]# mkdir /var/log/journal
[root@study ~]# chown root:systemd-journal /var/log/journal
[root@study ~]# chmod 2775 /var/log/journal

# 2. Restart systemd-journald and inspect the persisted journal data!
[root@study ~]# systemctl restart systemd-journald.service
[root@study ~]# ll /var/log/journal/
drwxr-sr-x. 2 root systemd-journal 27 Aug 20 02:37 309eb890d09f440681f596543d95ec7a

Note that the journal log files will now keep growing, so you should keep an eye on the total capacity available to your system, to avoid accidentally filling up the filesystem! Also, from now on there are no logs to inspect under /run/log, because they have moved to /var/log/journal!

My own thinking is this: since we still have rsyslog.service and logrotate, I'd personally suggest leaving the log files produced by systemd-journald.service in memory under /run/log, for faster access. And since rsyslog.service already stores our log files, there seems to be little need to keep an extra copy of the journal on disk. Just a suggestion! Handle it according to your own needs!

Go is a great and powerful language, but I have a few complaints

Go is a very good programming language, and yet my complaints about it in my company's Slack programming channel kept piling up (you can guess what I do for a living). I figured I should write them down and put them here, so that when people ask what I am complaining about, I can just send them a link.


For context: over the past year I have used Go heavily to build command-line applications (scc, lc) and APIs, including large-scale APIs consumed by clients, and the syntax highlighter that will soon be used on https://searchcode.com/.

All of these criticisms are aimed at Go, but I have complaints about every language I have ever used. I fully agree with the following:

"There are only two kinds of languages: the ones people complain about and the ones nobody uses." (Bjarne Stroustrup)

1 No support for functional programming

I am not a functional programming zealot. When someone says Lisp, the first thing I think of is a speech impediment.

This may be Go's biggest pain point. Unlike most people, I do not want Go to get generics, which would bring unnecessary complexity to most Go projects. What I do want are functional methods that work on the built-in slices and maps. Slices and maps are generic and can hold any type, so in that sense they are already magical; in Go, the only way to get a similar effect is through interfaces, which sacrifices both safety and speed.

For example, consider the following problem.

Given two slices of strings, find the strings that appear in both and put them into a new slice for later use.


existsBoth := []string{}
for _, first := range firstSlice {
    for _, second := range secondSlice {
        if first == second {
            existsBoth = append(existsBoth, first)
            break
        }
    }
}

This is a simple Go solution to the problem. There are other approaches, of course; a map could cut the running time. Here we assume memory is plentiful, or the slices are not too big, and that the extra complexity of optimizing the runtime would outweigh the benefit. As a comparison, the same logic rewritten with Java streams and functional programming:


var existsBoth = firstList.stream()
        .filter(x -> secondList.contains(x))
        .collect(Collectors.toList());

This hides the complexity of the algorithm, but it is far easier to understand what the code actually does.

Next to the Go code, the Java code's intent is obvious. The real flexibility is that adding further filter conditions is trivial; to add the conditions from the example below in Go, we would need two more if conditions inside the nested for loop (a Go sketch of that follows the Java example).


var existsBoth = firstList.stream()
        .filter(x -> secondList.contains(x))
        .filter(x -> x.startsWith(needle))
        .filter(x -> x.length() >= 5)
        .collect(Collectors.toList());
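For comparison, here is a rough, self-contained Go sketch of the version with the two extra conditions (my own illustration, not from the original article; the sample data and needle value are invented):

package main

import (
    "fmt"
    "strings"
)

func main() {
    firstSlice := []string{"needleful", "apple", "needles"}
    secondSlice := []string{"needles", "banana", "needleful"}
    needle := "needle"

    // The two extra .filter() steps become two more conditions
    // inside the nested loop.
    existsBoth := []string{}
    for _, first := range firstSlice {
        for _, second := range secondSlice {
            if first == second && strings.HasPrefix(first, needle) && len(first) >= 5 {
                existsBoth = append(existsBoth, first)
                break
            }
        }
    }
    fmt.Println(existsBoth) // [needleful needles]
}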

Some projects use the go generate command to give you some of this. But without good IDE support, extracting statements out of a loop into their own method is a slow and annoying affair.

2 Channels / parallel slice processing

Go channels are usually great, though they do not provide unlimited concurrency. They do have some issues where you can end up blocked forever, but those are easily resolved with the race detector. For streams of data whose size or end is unknown, and for processing that is not CPU-bound, channels are an excellent choice.

Where Go channels are not a great fit is parallel processing of a slice whose size is known in advance.

[Image: multithreaded programming, in theory and in practice]

In almost any other language, when you have a large list or slice you reach for parallel streams, parallel LINQ, Rayon, multiprocessing or some similar syntax to iterate the list on all CPU cores and collect the processed elements into a new list. With enough elements, or an expensive enough function, a multi-core system is faster.

In Go, however, what to do to achieve this efficiently is not obvious at all.

One possible solution is to spawn a Go routine for every element of the slice. Because goroutines are cheap, this is to some degree a workable strategy.


toProcess := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
var wg sync.WaitGroup
for i := range toProcess {
    wg.Add(1)
    go func(j int) {
        toProcess[j] = someSlowCalculation(toProcess[j])
        wg.Done()
    }(i)
}
wg.Wait()
fmt.Println(toProcess)

The code above preserves the order of the elements in the slice, though let's assume we don't need that.

The first problem with this code is the added WaitGroup, whose Add and Done methods you must remember to call. Get that wrong and the program does not produce correct output: it is either nondeterministic or never finishes. Moreover, for a long list you create one goroutine per element. As I said, Go can handle that easily; the problem is that all those goroutines fight for CPU time, so this is not the most efficient way to do the job.

What you probably want is one goroutine per CPU core, with the goroutines picking work off the list. Creating a goroutine is cheap, but creating them in a very tight loop is not; when I wrote scc I hit exactly this, so I adopted one goroutine per CPU core. To do that in Go, you create a channel, loop over the slice elements writing into the channel, and have worker functions read from it. Let's have a look.


toProcess := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
var input = make(chan int, len(toProcess))
for i := range toProcess {
    input <- i
}
close(input)
var wg sync.WaitGroup
for i := 0; i < runtime.NumCPU(); i++ {
    wg.Add(1)
    go func(input chan int, output []int) {
        for j := range input {
            output[j] = someSlowCalculation(output[j])
        }
        wg.Done()
    }(input, toProcess)
}
wg.Wait()
fmt.Println(toProcess)

The code above creates a channel, iterates the slice putting index values into the channel, spins up one goroutine per CPU core the OS reports to process that input, and then waits for everything to finish. That is a lot of code to take in.

It is also somewhat questionable: if the slice is very large, a channel whose buffer is as long as the slice may not be what you want. So really you should create another goroutine that loops over the slice, feeds values into the channel, and closes it when done. I cut that out because it makes the code longer, but I hope the basic idea comes across; a sketch of that variant follows.
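For completeness, here is a self-contained sketch of that feeder-goroutine variant (my own illustration; someSlowCalculation is a stand-in for the real CPU-heavy work):

package main

import (
    "fmt"
    "runtime"
    "sync"
)

// someSlowCalculation stands in for the real CPU-heavy work.
func someSlowCalculation(n int) int {
    return n * n
}

func main() {
    toProcess := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
    input := make(chan int) // no slice-sized buffer needed now

    // Feeder goroutine: pushes indexes, then closes the channel.
    go func() {
        for i := range toProcess {
            input <- i
        }
        close(input)
    }()

    var wg sync.WaitGroup
    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := range input {
                toProcess[j] = someSlowCalculation(toProcess[j])
            }
        }()
    }
    wg.Wait()
    fmt.Println(toProcess)
}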

The Java version looks roughly like this:


var firstList = List.of(1,2,3,4,5,6,7,8,9);
firstList = firstList.parallelStream()
        .map(this::someSlowCalculation)
        .collect(Collectors.toList());

Channels and streams are not equivalent; mimicking the Go logic with a queue would be a fairer comparison. But our goal here is not a one-to-one comparison; it is to process a slice or list using all CPU cores.

None of this is an issue if someSlowCalculation calls the network or is otherwise not CPU-bound; in that case channels and goroutines both perform very well.

This issue ties back to complaint #1. If Go had functional methods on slices and maps, this functionality could exist. It is also irritating that if Go had generics, someone could wrap all of the above into a library like Rust's Rayon and everyone would benefit (and I still do not want Go to get generics).

Incidentally, I believe this gap is holding Go back in data science, and it is why Python remains king there: Go lacks expressiveness and power for numerical work, for exactly the reasons discussed above.

3 Garbage collector

Go's garbage collector is very good, and my applications usually get faster with each new release thanks to its improvements. But it treats low latency as the top priority. For APIs and UI applications that trade-off is entirely acceptable, and it is fine for anything involving network calls, where the network tends to be the bottleneck.

The problem I found is that Go is no good at all if you want the highest possible throughput (and I also found it is not great for UI applications; I am not aware of any good support for them). This was a major problem when I built scc, a CPU-intensive command-line tool. To deal with it I had to add logic that switches the GC off until a threshold is reached; I could not simply disable it, because some tasks would quickly exhaust memory.
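A minimal sketch of that kind of workaround using the standard runtime APIs (my own illustration, not scc's actual code; the threshold and the work loop are invented):

package main

import (
    "fmt"
    "runtime"
    "runtime/debug"
)

const heapLimit = 512 << 20 // assumed threshold: 512 MiB

func main() {
    // Disable the collector for a CPU-bound burst of work.
    old := debug.SetGCPercent(-1)
    gcOff := true

    for i := 0; i < 1000000; i++ {
        _ = make([]byte, 1024) // stand-in for allocation-heavy work

        // Periodically re-check the heap; if it grows past the
        // threshold, turn the GC back on so memory doesn't run away.
        if gcOff && i%10000 == 0 {
            var m runtime.MemStats
            runtime.ReadMemStats(&m)
            if m.HeapAlloc > heapLimit {
                debug.SetGCPercent(old)
                gcOff = false
            }
        }
    }
    fmt.Println("done")
}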

The lack of control over the GC is frustrating at times. You learn to live with it, but sometimes it would be nice to say: "hey, this code really needs to run as fast as possible, so if you could run in high-throughput mode for a while, that would be great."


I think this is getting better in Go 1.12, where the GC has been improved further. But just being able to switch the GC off and on is not enough control for me; I will dig into it again when I have time.

4 Error handling

I am not the only one complaining about this, but I have to get it off my chest.


value, err := someFunc()
if err != nil {
    // Do something here
}
err = someOtherFunc(value)
if err != nil {
    // Do something here
}

The code above is tedious. And Go does not even force you to handle the error, as some people suggest: you can explicitly ignore it with "_" (does that count as handling it?), or ignore it entirely. The code above could be rewritten as:


value, _ := someFunc()
someOtherFunc(value)

Obviously I explicitly ignored the return value of someFunc. someOtherFunc(value) may also return an error value, which I ignored completely. None of the errors here are handled.

Honestly, I do not know how to solve this. I like Rust's "?" operator, which helps avoid this situation. V-Lang (https://vlang.io/) also looks like it may have some interesting solutions.

Another option is optional types (Optional) with nil removed, but that is not going to happen in Go, not even in Go 2.0, because it would break backward compatibility.

Conclusion

Go is still a very decent language. If you asked me to write an API, or a task that needs a lot of disk or network calls, it would still be my first choice. I now use Go rather than Python for many one-off tasks; the exception is data merging, where the lack of functional programming makes the required efficiency hard to reach.

Unlike Java, Go mostly follows the principle of least surprise. For example, you can compare two strings with stringA == stringB, but comparing two slices that way is a compile error. These are good properties.

True, the binaries could be smaller (some compile flags plus upx take care of that), I wish a few things were faster, GOPATH is not great but not as bad as people make out, the default unit-testing framework lacks many features, and mocking is a bit painful...

It is still one of the most productive languages I have used. I will keep using it, though I hope https://vlang.io/ eventually ships and resolves many of my complaints. V, or Go 2.0, or Nim, or Rust: there are so many cool new languages to choose from these days. We developers really are spoiled.

The original English article:

https://boyter.org/posts/my-personal-complaints-about-golang/

IDEA 2019.02.07 registration code

N757JE0KCT-eyJsaWNlbnNlSWQiOiJONzU3SkUwS0NUIiwibGljZW5zZWVOYW1lIjoid3UgYW5qdW4iLCJhc3NpZ25lZU5hbWUiOiIiLCJhc3NpZ25lZUVtYWlsIjoiIiwibGljZW5zZVJlc3RyaWN0aW9uIjoiRm9yIGVkdWNhdGlvbmFsIHVzZSBvbmx5IiwiY2hlY2tDb25jdXJyZW50VXNlIjpmYWxzZSwicHJvZHVjdHMiOlt7ImNvZGUiOiJJSSIsInBhaWRVcFRvIjoiMjAyMC0wMS0wNyJ9LHsiY29kZSI6IkFDIiwicGFpZFVwVG8iOiIyMDIwLTAxLTA3In0seyJjb2RlIjoiRFBOIiwicGFpZFVwVG8iOiIyMDIwLTAxLTA3In0seyJjb2RlIjoiUFMiLCJwYWlkVXBUbyI6IjIwMjAtMDEtMDcifSx7ImNvZGUiOiJHTyIsInBhaWRVcFRvIjoiMjAyMC0wMS0wNyJ9LHsiY29kZSI6IkRNIiwicGFpZFVwVG8iOiIyMDIwLTAxLTA3In0seyJjb2RlIjoiQ0wiLCJwYWlkVXBUbyI6IjIwMjAtMDEtMDcifSx7ImNvZGUiOiJSUzAiLCJwYWlkVXBUbyI6IjIwMjAtMDEtMDcifSx7ImNvZGUiOiJSQyIsInBhaWRVcFRvIjoiMjAyMC0wMS0wNyJ9LHsiY29kZSI6IlJEIiwicGFpZFVwVG8iOiIyMDIwLTAxLTA3In0seyJjb2RlIjoiUEMiLCJwYWlkVXBUbyI6IjIwMjAtMDEtMDcifSx7ImNvZGUiOiJSTSIsInBhaWRVcFRvIjoiMjAyMC0wMS0wNyJ9LHsiY29kZSI6IldTIiwicGFpZFVwVG8iOiIyMDIwLTAxLTA3In0seyJjb2RlIjoiREIiLCJwYWlkVXBUbyI6IjIwMjAtMDEtMDcifSx7ImNvZGUiOiJEQyIsInBhaWRVcFRvIjoiMjAyMC0wMS0wNyJ9LHsiY29kZSI6IlJTVSIsInBhaWRVcFRvIjoiMjAyMC0wMS0wNyJ9XSwiaGFzaCI6IjExNTE5OTc4LzAiLCJncmFjZVBlcmlvZERheXMiOjAsImF1dG9Qcm9sb25nYXRlZCI6ZmFsc2UsImlzQXV0b1Byb2xvbmdhdGVkIjpmYWxzZX0=-AE3x5sRpDellY4SmQVy2Pfc2IT7y1JjZFmDA5JtOv4K5gwVdJOLw5YGiOskZTuGu6JhOi50nnd0WaaNZIuVVVx3T5MlXrAuO3kb2qPtLtQ6/n3lp4fIv+6384D4ciEyRWijG7NA9exQx39Tjk7/xqaGk7ooKgq5yquIfIA+r4jlbW8j9gas1qy3uTGUuZQiPB4lv3P5OIpZzIoWXnFwWhy7s//mjOWRZdf/Du3RP518tMk74wizbTeDn84qxbM+giNAn+ovKQRMYHtLyxntBiP5ByzfAA9Baa5TUGW5wDiZrxFuvBAWTbLrRI0Kd7Nb/tB9n1V9uluB2WWIm7iMxDg==-MIIElTCCAn2gAwIBAgIBCTANBgkqhkiG9w0BAQsFADAYMRYwFAYDVQQDDA1KZXRQcm9maWxlIENBMB4XDTE4MTEwMTEyMjk0NloXDTIwMTEwMjEyMjk0NlowaDELMAkGA1UEBhMCQ1oxDjAMBgNVBAgMBU51c2xlMQ8wDQYDVQQHDAZQcmFndWUxGTAXBgNVBAoMEEpldEJyYWlucyBzLnIuby4xHTAbBgNVBAMMFHByb2QzeS1mcm9tLTIwMTgxMTAxMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAxcQkq+zdxlR2mmRYBPzGbUNdMN6OaXiXzxIWtMEkrJMO/5oUfQJbLLuMSMK0QHFmaI37WShyxZcfRCidwXjot4zmNBKnlyHodDij/78TmVqFl8nOeD5+07B8VEaIu7c3E1N+e1doC6wht4I4+IEmtsPAdoaj5WCQVQbrI8KeT8M9VcBIWX7fD0fhexfg3ZRt0xqwMcXGNp3DdJHiO0rCdU+Itv7EmtnSVq9jBG1usMSFvMowR25mju2JcPFp1+I4ZI+FqgR8gyG8oiNDyNEoAbsR3lOpI7grUYSvkB/xVy/VoklPCK2h0f0GJxFjnye8NT1PAywoyl7RmiAVRE/EKwIDAQABo4GZMIGWMAkGA1UdEwQCMAAwHQYDVR0OBBYEFGEpG9oZGcfLMGNBkY7SgHiMGgTcMEgGA1UdIwRBMD+AFKOetkhnQhI2Qb1t4Lm0oFKLl/GzoRykGjAYMRYwFAYDVQQDDA1KZXRQcm9maWxlIENBggkA0myxg7KDeeEwEwYDVR0lBAwwCgYIKwYBBQUHAwEwCwYDVR0PBAQDAgWgMA0GCSqGSIb3DQEBCwUAA4ICAQAF8uc+YJOHHwOFcPzmbjcxNDuGoOUIP+2h1R75Lecswb7ru2LWWSUMtXVKQzChLNPn/72W0k+oI056tgiwuG7M49LXp4zQVlQnFmWU1wwGvVhq5R63Rpjx1zjGUhcXgayu7+9zMUW596Lbomsg8qVve6euqsrFicYkIIuUu4zYPndJwfe0YkS5nY72SHnNdbPhEnN8wcB2Kz+OIG0lih3yz5EqFhld03bGp222ZQCIghCTVL6QBNadGsiN/lWLl4JdR3lJkZzlpFdiHijoVRdWeSWqM4y0t23c92HXKrgppoSV18XMxrWVdoSM3nuMHwxGhFyde05OdDtLpCv+jlWf5REAHHA201pAU6bJSZINyHDUTB+Beo28rRXSwSh3OUIvYwKNVeoBY+KwOJ7WnuTCUq1meE6GkKc4D/cXmgpOyW/1SmBz3XjVIi/zprZ0zf3qH5mkphtg6ksjKgKjmx1cXfZAAX6wcDBNaCL+Ortep1Dh8xDUbqbBVNBL4jbiL3i3xsfNiyJgaZ5sX7i8tmStEpLbPwvHcByuf59qJhV/bZOl8KqJBETCDJcY6O2aqhTUy+9x93ThKs1GKrRPePrWPluud7ttlgtRveit/pcBrnQcXOl1rHq7ByB8CFAxNotRUYL9IF5n3wJOgkPojMy6jetQA5Ogc8Sm7RG6vg1yow==

Nginx configuration for this blog: the security part

Careful readers have asked why this blog's subtitle is "focused on WEB development" rather than "front-end development": is the "front" missing? My point is that although I have worked on professional front-end teams for the seven years or so since graduation, my knowledge does not have to be confined to the front end. The popular notion of the "full-stack engineer", to me, means having the skills of every role in a project, not necessarily doing everything yourself; what you actually do is determined by ability, staffing and cost. My current responsibilities are in WEB front-end, but my attention is on the WEB side as a whole.

Some front-end friends of mine boxed themselves into a very small area from the start. In a big company that is fine: the division of labour is clear and the infrastructure complete, so you can just do the part you are best at. But when they move to a startup, they are suddenly confronted with many things they have never touched and find themselves on the back foot.

Last year I replaced an online PHP + Nginx service handling tens of millions of requests with Lua + OpenResty, and it has been running stably ever since; a small experiment beyond the front end. I have always held that practice is key to learning anything, so I keep tinkering with my VPS, trying out all sorts of WEB security and optimization ideas. I plan to cover this blog's Nginx configuration from both the security and the performance angle; today, security.

Hiding unnecessary information

Look at this blog's response headers and you will find the line server: nginx: it reveals that I use Nginx, but not the exact version. Since some Nginx vulnerabilities only exist in specific versions, hiding the version number improves security. That only requires this in the configuration:

server_tokens   off;

To hide the Web Server more thoroughly, you can edit the Nginx source to change the Server name and recompile; the exact steps are easy to search for. One caveat: if your site supports SPDY, changing only the places mentioned in those online articles is not enough; the SPDY-related code must be changed too. A simpler option is Tengine, an enhanced fork of Nginx, where you can set server_tag to off or to any value you like. And since the goal is to hide Nginx completely, the 404, 500 and other error pages should also be customized (a minimal sketch follows).
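A minimal sketch of custom error pages (the file paths are placeholders, not my actual configuration):

error_page  404              /404.html;
error_page  500 502 503 504  /50x.html;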

Similarly, the x-powered-by header that some WEB languages or frameworks output by default also leaks information about the site. They generally offer a way to change or remove it; check the manual. If your deployment puts Nginx in front as a reverse proxy, you can also hide it with the proxy_hide_header directive:

proxy_hide_header        X-Powered-By;

Disabling unnecessary methods

My blog only handles the GET and POST request methods, while HTTP/1 also defines methods such as TRACE for network diagnostics, which can expose information. So for anything other than GET, POST and HEAD I directly return status 444 (an Nginx-specific status code that closes the connection immediately, with no response body). The configuration:

if ($request_method !~ ^(GET|HEAD|POST)$ ) {
    return    444;
}

Sensible response headers

This blog is served by a Node program I wrote with ThinkJS; Nginx reverse-proxies requests via proxy_pass to the IP and port Node listens on. On the final output I add the following headers:

add_header  Strict-Transport-Security  "max-age=31536000";
add_header  X-Frame-Options  deny;
add_header  X-Content-Type-Options  nosniff;
add_header  Content-Security-Policy  "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://a.disquscdn.com; img-src 'self' data: https://www.google-analytics.com; style-src 'self' 'unsafe-inline'; frame-src https://disqus.com";

Strict-Transport-Security (HSTS for short) tells the browser to always reach my blog over HTTPS within the specified max-age. Even if the user types an HTTP address or clicks an HTTP link, the browser rewrites it to HTTPS locally before sending the request. Because my certificate does not cover multiple domains, I did not add includeSubDomains. See my earlier write-up for more on HSTS.

X-Frame-Options specifies whether this page may be embedded in an iframe; deny forbids any embedding. More about this header here.

X-Content-Type-Options controls how browsers guess the real type of resources whose Content-Type is missing or wrong; nosniff forbids any guessing. More details here.

Content-Security-Policy (CSP for short) specifies which resources the page may load, mainly to reduce XSS. I allow external JS from this site and disquscdn, inline JS, and eval in JS; images from this site and Google Analytics plus inline images (Data URIs); external and inline CSS from this site; and iframes loading pages from disqus. Any resource not listed falls back to the default rule self, i.e. only this site. See here for a detailed introduction to CSP.

In an earlier post I also covered the X-XSS-Protection response header, which can likewise guard against XSS; since I have CSP, I did not configure it.
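For reference, if you did want that header, it looks like this (not part of my actual configuration):

add_header  X-XSS-Protection  "1; mode=block";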

Keep in mind that only modern browsers support these headers, so adding them does not mean the site can ignore XSS and call it a day. But given the negligible cost, configure them all anyway.

HTTPS security configuration

Enabling HTTPS with a correctly configured certificate means that data in transit cannot be decrypted or tampered with by third parties. But to get the most out of HTTPS, the Web Server must be configured sensibly too. This blog's HTTPS-related configuration:

ssl_certificate      /home/jerry/ssl/server.crt;
ssl_certificate_key  /home/jerry/ssl/server.key;
ssl_dhparam          /home/jerry/ssl/dhparams.pem;

ssl_ciphers          ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:AES128-GCM-SHA256:AES256-GCM-SHA384:DES-CBC3-SHA;

ssl_prefer_server_ciphers  on;

ssl_protocols        TLSv1 TLSv1.1 TLSv1.2;

The net result is that this blog scores A+ in the ssllabs test, as shown below:

[Image: ssllabs test result]

For configuring ssl_ciphers, this site is a useful reference. Note that the cipher suites it suggests by default are high-security ones that some older clients do not support, such as IE 9 and earlier, Android 2.2 and earlier, and Java 6 and earlier. If you need to support such legacy clients, click the site's "Yes, give me a ciphersuite that works with legacy / old software" link.

I also put CHACHA20 at the very front of ssl_ciphers, because my Nginx supports the CHACHA20_POLY1305 cipher. Developed by Google, this new-generation cipher has two advantages: better security and better performance (especially on mobile and wearable devices). Below is a comparison of its speed against AES-GCM on a mobile platform (via):

[Image: CHACHA20_POLY1305 vs AES-GCM encryption speed on mobile]

The simplest way to enable CHACHA20_POLY1305 is to build Nginx against LibreSSL instead of OpenSSL. Below is what Chrome shows when I click the padlock in the address bar on my blog; you can see the cipher in use is CHACHA20_POLY1305:

[Image: Chrome connection details for imququ.com]

For a detailed look at the security and performance of CHACHA20_POLY1305, see this article.

Update: the best practice for CHACHA20_POLY1305 is to use CHACHA20 only for clients without AES-NI, and AES-GCM for everyone else. For a detailed explanation and configuration, see my article on using BoringSSL to optimize the HTTPS cipher choice.

For the ssl_dhparam configuration, see: Guide to Deploying Diffie-Hellman for TLS.

SSLv3 has been proven insecure, so I did not include it in the ssl_protocols directive.

Setting ssl_prefer_server_ciphers to on ensures the server's cipher preferences are used during the TLS handshake, which strengthens security.

That's it for this post; a performance-focused one will follow.

Learn PostgreSQL Application Development and Administration in One Day - 1: How to set up an environment for learning and developing PostgreSQL

Background

Every journey starts with a first step: setting up a good environment for learning and developing PostgreSQL matters most of all.

Because users on other platforms (Ubuntu, CentOS, MAC) can mostly install the database themselves, here I only write an environment-setup document aimed at Windows users.

It is split into three parts; pick whichever suits you.

If you want to study PostgreSQL in depth, I suggest setting up PostgreSQL on Linux. If you only want to use the database in day-to-day application development and do not need PG's extra plugin functionality, PostgreSQL on Windows is enough.

If you would rather not run a local PostgreSQL, you can use a cloud database service, such as Alibaba Cloud RDS for PostgreSQL.

Outline

Part 1: PostgreSQL on Windows

1 Requirements

2 Download the PostgreSQL package

3 Unpack the PostgreSQL package

4 Download the pgadmin package (optional)

5 Install pgadmin (optional)

6 Plan the data directory

7 Initialize the database cluster

8 Configure postgresql.conf

9 Configure pg_hba.conf (optional)

10 Start and stop the database cluster

11 Starting the database cluster automatically

12 Connect to the database with the psql command line

13 Create users

14 Use psql help

15 Use psql tab completion

16 Use psql SQL syntax help

17 View the current configuration

18 Set session parameters

19 Switch to another user or database inside psql

20 Connect to the database with pgadmin4

21 Documentation

Part 2: PostgreSQL on Linux (virtual machine)

1 Requirements

2 Download a Linux image

3 Install VMware Workstation (trial)

4 Install SecureCRT (trial)

5 Install the Linux virtual machine

6 Configure the virtual machine network

7 Connect to Linux with a SecureCRT terminal

8 Configure Linux

9 Configure a yum repository (optional)

10 Create a regular user

11 Plan the database storage directory

12 Download the PostgreSQL source

13 Install PostgreSQL

14 Configure the Linux user's environment variables

15 Initialize the database cluster

16 Configure the database

17 Start the database cluster

18 Connect to the database

19 Install pgadmin (optional)

20 Configure pgadmin (optional)

21 Connect to the database with pgadmin (optional)

Part 3: RDS for PostgreSQL in the cloud

1 Buy the cloud database

2 Set and remember the RDS for PostgreSQL root user name and password

3 Configure the network

4 Configure the whitelist

5 Install pgadmin locally (optional)

6 Configure pgadmin locally (optional)

7 Connect to the RDS PostgreSQL database with pgadmin (optional)

Part 1: PostgreSQL on Windows

1 Requirements

Windows 7 x64, 8GB+ RAM, 4+ cores, SSD (recommended), 100GB+ free disk space, public network access (10MB/s+ bandwidth)

2 Download the PostgreSQL package

https://www.postgresql.org/download/windows/

The binaries package is recommended: no installation needed, just use it directly.

Download the win x64 build (the latest version is recommended)

http://www.enterprisedb.com/products/pgbindownload.do

For example

https://get.enterprisedb.com/postgresql/postgresql-9.6.2-3-windows-x64-binaries.zip

3 Unpack the PostgreSQL package

postgresql-9.6.2-3-windows-x64-binaries.zip

For example, unpack to d:\pgsql


bin: binaries

doc: documentation

include: header files

lib: dynamic libraries

pgAdmin 4: graphical administration tool

share: extensions

StackBuilder: package builder

symbols: symbol tables

4 Download the pgadmin package (optional)

If the PostgreSQL package does not include pgAdmin, download one yourself

pgadmin4 is recommended (pgadmin3 is no longer maintained)

https://www.pgadmin.org/index.php

https://www.postgresql.org/ftp/pgadmin3/pgadmin4/v1.3/windows/

5 Install pgadmin (optional)

6 Plan the data directory

For example, use pgdata on drive D as the database directory.

Create an empty directory d:\pgdata.

7 Initialize the database cluster

Open cmd.exe as Administrator

>d:  
  
>cd pgsql  
  
>cd bin  
  
>initdb.exe -D d:\pgdata -E UTF8 --locale=C -U postgres  
  
When initializing, specify the data directory, encoding, locale and the database superuser name  


8 Configure postgresql.conf

The database configuration file is named postgresql.conf and lives in the data directory D:\pgdata.

Append the following to the end of postgresql.conf

listen_addresses = '0.0.0.0'  
port = 1921  
max_connections = 200  
tcp_keepalives_idle = 60  
tcp_keepalives_interval = 10  
tcp_keepalives_count = 6  
shared_buffers = 512MB  
maintenance_work_mem = 64MB  
dynamic_shared_memory_type = windows  
vacuum_cost_delay = 0  
bgwriter_delay = 10ms  
bgwriter_lru_maxpages = 1000  
bgwriter_lru_multiplier = 5.0  
bgwriter_flush_after = 0  
old_snapshot_threshold = -1  
wal_level = minimal  
synchronous_commit = off  
full_page_writes = on  
wal_buffers = 64MB  
wal_writer_delay = 10ms  
wal_writer_flush_after = 4MB  
checkpoint_timeout = 35min  
max_wal_size = 2GB  
min_wal_size = 80MB  
checkpoint_completion_target = 0.1  
checkpoint_flush_after = 0  
random_page_cost = 1.5  
log_destination = 'csvlog'  
logging_collector = on  
log_directory = 'pg_log'  
log_truncate_on_rotation = on  
log_checkpoints = on  
log_connections = on  
log_disconnections = on  
log_error_verbosity = verbose  
log_temp_files = 8192  
log_timezone = 'Asia/Hong_Kong'  
autovacuum = on  
log_autovacuum_min_duration = 0  
autovacuum_naptime = 20s  
autovacuum_vacuum_scale_factor = 0.05  
autovacuum_freeze_max_age = 1500000000  
autovacuum_multixact_freeze_max_age = 1600000000  
autovacuum_vacuum_cost_delay = 0  
vacuum_freeze_table_age = 1400000000  
vacuum_multixact_freeze_table_age = 1500000000  
datestyle = 'iso, mdy'  
timezone = 'Asia/Hong_Kong'  
lc_messages = 'C'  
lc_monetary = 'C'  
lc_numeric = 'C'  
lc_time = 'C'  
default_text_search_config = 'pg_catalog.english'  

9 Configure pg_hba.conf (optional)

The database access-control file is named pg_hba.conf and lives in the data directory D:\pgdata.

Append the following to the end of the file; it allows network users to connect to your postgresql database with a username and password.

host all all 0.0.0.0/0 md5  
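Note that 0.0.0.0/0 matches any source address. If you only need access from one network, a narrower rule is safer; the subnet below is just an example:

host all all 192.168.1.0/24 md5  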

10 Start and stop the database cluster

Start the database cluster from the command line

>d:  
  
>cd pgsql  
  
>cd bin  
  
D:\pgsql\bin>pg_ctl.exe start -D d:\pgdata  
server starting  
  
D:\pgsql\bin>LOG:  00000: redirecting log output to logging collector process  
HINT:  Future log output will appear in directory "pg_log".  
LOCATION:  SysLogger_Start, syslogger.c:622  

Stop the database cluster from the command line

D:\pgsql\bin>pg_ctl.exe stop -m fast -D "d:\pgdata"
waiting for server to shut down .... done
server stopped

11 Starting the database cluster automatically

Register a Windows service so the cluster starts with the machine; a sketch follows.
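A minimal sketch using pg_ctl's built-in Windows service registration (the service name postgresql-9.6 is an arbitrary choice; -S auto makes the service start automatically):

D:\pgsql\bin>pg_ctl.exe register -N postgresql-9.6 -D d:\pgdata -S auto  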

12 Connect to the database with the psql command line

psql -h host -p port -U username dbname

D:\pgsql\bin>psql -h 127.0.0.1 -p 1921 -U postgres postgres  
psql (9.6.2)  
Type "help" for help.  
  
postgres=# \dt  

13 Create users

Creating users is a database operation; first connect to the database with psql as the superuser postgres.

Create a regular user

postgres=# create role digoal login encrypted password 'pwd_digoal';  
CREATE ROLE  

Create a superuser

postgres=# create role dba_digoal login superuser encrypted password 'dba_pwd_digoal';  
CREATE ROLE  

Create a streaming-replication user

postgres=# create role digoal_rep replication login encrypted password 'pwd';  
CREATE ROLE  

You can also move a user between roles

For example, make digoal a superuser

postgres=# alter role digoal superuser;  
ALTER ROLE  

List existing users

postgres=# \du+  
                                    List of roles  
 Role name  |                         Attributes                         | Member of | Description  
------------+------------------------------------------------------------+-----------+-------------  
 dba_digoal | Superuser                                                  | {}        |  
 digoal     | Superuser                                                  | {}        |  
 digoal_rep | Replication                                                | {}        |  
 postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |  

14 Use psql help

psql has many shortcut commands; use \? to list them.

postgres=# \?  
General  
  \copyright             show PostgreSQL usage and distribution terms  
  \errverbose            show most recent error message at maximum verbosity  
  \g [FILE] or ;         execute query (and send results to file or |pipe)  
  \gexec                 execute query, then execute each value in its result  
  \gset [PREFIX]         execute query and store results in psql variables  
  \q                     quit psql  
  \crosstabview [COLUMNS] execute query and display results in crosstab  
  \watch [SEC]           execute query every SEC seconds  
  
Help  
  \? [commands]          show help on backslash commands  
  
  ......  
  

15 Use psql tab completion

If PostgreSQL was compiled with the completion option, pressing TAB in psql auto-completes commands.

16 Use psql SQL syntax help

If you forget the syntax of a SQL command, \h prints its help

For example

postgres=# \h create table  
Command:     CREATE TABLE  
Description: define a new table  
Syntax:  
CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name ( [  
  { column_name data_type [ COLLATE collation ] [ column_constraint [ ... ] ]  
    | table_constraint  
    | LIKE source_table [ like_option ... ] }  
    [, ... ]  
] )  
  
......  

17 View the current configuration

show parameter_name

postgres=# show client_encoding;  
 client_encoding  
-----------------  
 GBK  
(1 row)  

Or query pg_settings

postgres=# select * from pg_settings;  

18 Set session parameters

set parameter_name = value;

postgres=# set client_encoding='sql_ascii';  
SET  
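To return to the session default, reset works the same way (standard SQL, shown here for completeness):

postgres=# reset client_encoding;  
RESET  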

19 Switch to another user or database inside psql

\c switches to another user or database

postgres=# \c template1 digoal  
You are now connected to database "template1" as user "digoal".  

20 Connect to the database with pgadmin4

pgAdmin4 is installed in this directory

d:\pgsql\pgAdmin 4\bin  

Double-click pgAdmin4.exe to start pgadmin4 (it takes a moment; it starts an HTTPD service automatically)

Right-click Servers and create a server.

Configure the server alias and the IP, port, user, password and database name for the connection


21 Documentation

The PostgreSQL package includes pgadmin and the PostgreSQL documentation; find the corresponding doc directory and open index.html.

Part 2: PostgreSQL on Linux (virtual machine)

1 Requirements

Windows 7 x64, 8GB+ RAM, 4+ cores, SSD (recommended), 100GB+ free disk space, public network access (10MB/s+ bandwidth)

2 Download a Linux image

http://isoredirect.centos.org/centos/6/isos/x86_64/

http://mirrors.163.com/centos/6.9/isos/x86_64/CentOS-6.9-x86_64-minimal.iso

3 Install VMware Workstation (trial)

http://www.vmware.com/cn/products/workstation/workstation-evaluation.html

4 Install SecureCRT (trial)

SecureCRT can be used to connect to the Linux terminal conveniently

https://www.vandyke.com/products/securecrt/windows.html

5 Install the Linux virtual machine

Open vmware, create a virtual machine, and choose the CentOS 6 x64 profile.

1. Suggested configuration:

4G RAM, 40G disk, 2+ cores, NAT networking.

2. Suggested installation:

minimal installation.

3. root password:

Remember the root password you set.

4. Suggested Linux setup

Set the hostname, configure the network (to match your VMware NAT network), disable selinux, and disable the firewall or open the ssh port (test environment only).

6 Configure the virtual machine network

Connect to linux in the vmware window

In this example the addresses are in 192.168.150.*; adjust them to match your own VMware NAT network.

Configure the gateway

vi /etc/sysconfig/network  
  
NETWORKING=yes  
HOSTNAME=digoal01  
GATEWAY=192.168.150.2  

Configure the IP address

cat /etc/sysconfig/network-scripts/ifcfg-eth0   
  
DEVICE=eth0  
TYPE=Ethernet  
UUID=d28f566a-b0b9-4bde-95e7-20488af19eb6  
ONBOOT=yes  
NM_CONTROLLED=yes  
BOOTPROTO=static  
HWADDR=00:0C:29:5D:6D:9C  
IPADDR=192.168.150.133  
PREFIX=24  
GATEWAY=192.168.150.2  
DNS1=192.168.150.2  
DEFROUTE=yes  
IPV4_FAILURE_FATAL=yes  
IPV6INIT=no  
NAME="System eth0"  

Configure DNS

cat /etc/resolv.conf  
  
nameserver 192.168.150.2  

Restart the network service

service network restart  

7 Connect to Linux with a SecureCRT terminal

Add a session that connects to the Linux virtual machine.


8 Configure Linux

1. /etc/sysctl.conf

vi /etc/sysctl.conf  
  
Append to the end of the file  
  
kernel.shmall = 4294967296  
kernel.shmmax=135497418752  
kernel.shmmni = 4096  
kernel.sem = 50100 64128000 50100 1280  
fs.file-max = 7672460  
fs.aio-max-nr = 1048576  
net.ipv4.ip_local_port_range = 9000 65000  
net.core.rmem_default = 262144  
net.core.rmem_max = 4194304  
net.core.wmem_default = 262144  
net.core.wmem_max = 4194304  
net.ipv4.tcp_max_syn_backlog = 4096  
net.core.netdev_max_backlog = 10000  
net.ipv4.netfilter.ip_conntrack_max = 655360  
net.ipv4.tcp_timestamps = 0  
net.ipv4.tcp_tw_recycle=1  
net.ipv4.tcp_timestamps=1  
net.ipv4.tcp_keepalive_time = 72   
net.ipv4.tcp_keepalive_probes = 9   
net.ipv4.tcp_keepalive_intvl = 7  
vm.zone_reclaim_mode=0  
vm.dirty_background_bytes = 40960000  
vm.dirty_ratio = 80  
vm.dirty_expire_centisecs = 6000  
vm.dirty_writeback_centisecs = 50  
vm.swappiness=0  
vm.overcommit_memory = 0  
vm.overcommit_ratio = 90  

Apply the settings

sysctl -p  

2. /etc/security/limits.conf

vi /etc/security/limits.conf   
  
* soft    nofile  131072  
* hard    nofile  131072  
* soft    nproc   131072  
* hard    nproc   131072  
* soft    core    unlimited  
* hard    core    unlimited  
* soft    memlock 500000000  
* hard    memlock 500000000  

3. /etc/security/limits.d/*

rm -f /etc/security/limits.d/*  

4. Disable selinux

# vi /etc/sysconfig/selinux   
  
SELINUX=disabled  
SELINUXTYPE=targeted  

5. Configure the OS firewall
(set rules to match your environment; here I simply flush them)

iptables -F  

Example rules

# private network ranges  
-A INPUT -s 192.168.0.0/16 -j ACCEPT  
-A INPUT -s 10.0.0.0/8 -j ACCEPT  
-A INPUT -s 172.16.0.0/16 -j ACCEPT  

Reboot Linux.

reboot  

9 Configure a yum repository (optional)

On the Linux VM, pick a partition with enough free space and download the ISO images

wget http://mirrors.163.com/centos/6.9/isos/x86_64/CentOS-6.9-x86_64-bin-DVD1.iso  
  
wget http://mirrors.163.com/centos/6.9/isos/x86_64/CentOS-6.9-x86_64-bin-DVD2.iso  

Create mount point directories for the ISOs

mkdir /mnt/cdrom1  
mkdir /mnt/cdrom2  

Mount the ISOs

mount -o loop,defaults,ro /u01/CentOS-6.9-x86_64-bin-DVD1.iso /mnt/cdrom1  
mount -o loop,defaults,ro /u01/CentOS-6.9-x86_64-bin-DVD2.iso /mnt/cdrom2  

Back up and remove the existing YUM configuration files

mkdir /tmp/yum.bak  
cd /etc/yum.repos.d/  
mv * /tmp/yum.bak/  

Add a new YUM configuration file

cd /etc/yum.repos.d/  
  
vi local.repo  
  
[local-yum]  
name=Local Repository  
baseurl=file:///mnt/cdrom1  
enabled=1  
gpgcheck=0  

Refresh the YUM cache

yum clean all  

Test

yum list  
  
yum install createrepo   # handy for the steps below  

Modify the YUM configuration so baseurl points at the parent directory

cd /etc/yum.repos.d/  
  
vi local.repo  
  
[local-yum]  
name=Local Repository  
baseurl=file:///mnt/  
enabled=1  
gpgcheck=0  

Build the YUM index

cd /mnt/  
createrepo .  

Refresh the YUM cache and test

yum clean all  
  
yum list  
  
yum install vim  

10 Create a regular user

useradd digoal  

11 Plan the database storage directory

Assuming the /home partition has enough space, /home/digoal/pgdata is planned as the data directory

Filesystem      Size  Used Avail Use% Mounted on  
/dev/sda3        14G  5.7G  7.2G  45% /  

12 Download the PostgreSQL source

https://www.postgresql.org/ftp/source/

su - digoal  
  
wget https://ftp.postgresql.org/pub/source/v9.6.2/postgresql-9.6.2.tar.bz2  

13 Install PostgreSQL

Install the dependencies

As root, install the dependency packages with yum  
  
yum -y install coreutils glib2 lrzsz mpstat dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex openjade bzip2  

Build and install PostgreSQL

As the digoal user, build and install PostgreSQL  
  
tar -jxvf postgresql-9.6.2.tar.bz2  
cd postgresql-9.6.2  
./configure --prefix=/home/digoal/pgsql9.6  
make world -j 8  
make install-world  

14 Configure the Linux user's environment variables

As the digoal user, configure the environment variables

su - digoal  
vi ~/.bash_profile  
  
Append  
  
export PS1="$USER@`/bin/hostname -s`-> "  
export PGPORT=1921  
export PGDATA=/home/digoal/pgdata  
export LANG=en_US.utf8  
export PGHOME=/home/digoal/pgsql9.6  
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH  
export PATH=$PGHOME/bin:$PATH:.  
export DATE=`date +"%Y%m%d%H%M"`  
export MANPATH=$PGHOME/share/man:$MANPATH  
export PGHOST=$PGDATA  
export PGUSER=postgres  
export PGDATABASE=postgres  
alias rm='rm -i'  
alias ll='ls -lh'  
unalias vi  

Log in again as digoal so the settings take effect

exit  
  
su - digoal  

15 Initialize the database cluster

initdb -D $PGDATA -E UTF8 --locale=C -U postgres  

16 Configure the database

The configuration files are in the $PGDATA directory

1. Configure postgresql.conf

Append  
  
listen_addresses = '0.0.0.0'  
port = 1921  
max_connections = 200  
unix_socket_directories = '.'  
tcp_keepalives_idle = 60  
tcp_keepalives_interval = 10  
tcp_keepalives_count = 10  
shared_buffers = 512MB  
dynamic_shared_memory_type = posix  
vacuum_cost_delay = 0  
bgwriter_delay = 10ms  
bgwriter_lru_maxpages = 1000  
bgwriter_lru_multiplier = 10.0  
bgwriter_flush_after = 0   
old_snapshot_threshold = -1  
backend_flush_after = 0   
wal_level = minimal  
synchronous_commit = off  
full_page_writes = on  
wal_buffers = 16MB  
wal_writer_delay = 10ms  
wal_writer_flush_after = 0   
checkpoint_timeout = 30min   
max_wal_size = 2GB  
min_wal_size = 128MB  
checkpoint_completion_target = 0.05    
checkpoint_flush_after = 0    
random_page_cost = 1.3   
log_destination = 'csvlog'  
logging_collector = on  
log_truncate_on_rotation = on  
log_checkpoints = on  
log_connections = on  
log_disconnections = on  
log_error_verbosity = verbose  
autovacuum = on  
log_autovacuum_min_duration = 0  
autovacuum_naptime = 20s  
autovacuum_vacuum_scale_factor = 0.05  
autovacuum_freeze_max_age = 1500000000  
autovacuum_multixact_freeze_max_age = 1600000000  
autovacuum_vacuum_cost_delay = 0  
vacuum_freeze_table_age = 1400000000  
vacuum_multixact_freeze_table_age = 1500000000  
datestyle = 'iso, mdy'  
timezone = 'PRC'  
lc_messages = 'C'  
lc_monetary = 'C'  
lc_numeric = 'C'  
lc_time = 'C'  
default_text_search_config = 'pg_catalog.english'  
shared_preload_libraries='pg_stat_statements'  

2. Configure pg_hba.conf

Append  
  
host all all 0.0.0.0/0 md5  

17 Start the database cluster

su - digoal  
  
pg_ctl start  

18 Connect to the database

su - digoal  
  
psql  
psql (9.6.2)  
Type "help" for help.  
  
postgres=#   

19 Install pgadmin (optional)

On a windows machine, install pgadmin

https://www.pgadmin.org/download/windows4.php

20 Configure pgadmin (optional)

See Part 1

21 Connect to the database with pgadmin (optional)

See Part 1

Part 3: RDS for PostgreSQL in the cloud

1 Buy the cloud database

https://www.aliyun.com/product/rds/postgresql

2 Set and remember the RDS for PostgreSQL root user name and password

This is done in the RDS console.

3 Configure the network

In the RDS console, configure the URL and port used to connect to the database.

4 Configure the whitelist

In the RDS console, configure the whitelist of allowed source IPs; if your source IP is dynamic, set the whitelist to 0.0.0.0.

(Opening a database to the public network is risky; configure this carefully. This document describes a test environment only.)

5 Install pgadmin locally (optional)

On a windows machine, install pgadmin

https://www.pgadmin.org/download/windows4.php

6 Configure pgadmin locally (optional)

See Part 1

7 Connect to the RDS PostgreSQL database with pgadmin (optional)

See Part 1

MongoDB database operations: backup, restore, export and import

MongoDB backup and restore come in two flavours: mongodump and mongorestore work at the database level, while mongoexport and mongoimport work on the collections inside a database.

Part 1: mongodump: back up a database

1. Common command format

mongodump -h IP --port port -u username -p password -d database -o output_path 

If there is no user, drop -u and -p.
If dumping from the local machine, drop -h.
If using the default port, drop --port.
To dump all databases, drop -d.

2. Dump all databases

[root@localhost mongodb]# mongodump -h 127.0.0.1 -o /home/zhangy/mongodb/ 
connected to: 127.0.0.1 
Tue Dec 3 06:15:55.448 all dbs 
Tue Dec 3 06:15:55.449 DATABASE: test   to   /home/zhangy/mongodb/test 
Tue Dec 3 06:15:55.449   test.system.indexes to /home/zhangy/mongodb/test/system.indexes.bson 
Tue Dec 3 06:15:55.450     1 objects 
Tue Dec 3 06:15:55.450   test.posts to /home/zhangy/mongodb/test/posts.bson 
Tue Dec 3 06:15:55.480     0 objects 
 
.................... output omitted .................... 

3. Dump a specified database

[root@localhost mongodb]# mongodump -h 192.168.1.108 -d tank -o /home/zhangy/mongodb/ 
connected to: 192.168.1.108 
Tue Dec 3 06:11:41.618 DATABASE: tank   to   /home/zhangy/mongodb/tank 
Tue Dec 3 06:11:41.623   tank.system.indexes to /home/zhangy/mongodb/tank/system.indexes.bson 
Tue Dec 3 06:11:41.623     2 objects 
Tue Dec 3 06:11:41.623   tank.contact to /home/zhangy/mongodb/tank/contact.bson 
Tue Dec 3 06:11:41.669     2 objects 
Tue Dec 3 06:11:41.670   Metadata for tank.contact to /home/zhangy/mongodb/tank/contact.metadata.json 
Tue Dec 3 06:11:41.670   tank.users to /home/zhangy/mongodb/tank/users.bson 
Tue Dec 3 06:11:41.685     2 objects 
Tue Dec 3 06:11:41.685   Metadata for tank.users to /home/zhangy/mongodb/tank/users.metadata.json 

Part 2: mongorestore: restore a database

1. Common command format

mongorestore -h IP --port port -u username -p password -d database --drop backup_path

--drop means: delete all existing records first, then restore.

2. Restore all databases into mongodb

[root@localhost mongodb]# mongorestore /home/zhangy/mongodb/  # the path containing the backups of all databases

3. Restore a specified database

[root@localhost mongodb]# mongorestore -d tank /home/zhangy/mongodb/tank/  # backup path of the tank database 
 
[root@localhost mongodb]# mongorestore -d tank_new /home/zhangy/mongodb/tank/  # restore tank into the tank_new database

These two commands give you database-level backup and restore, with files in json and bson format. For exporting single collections or selected fields, and for CSV, use the export/import pair below.

Part 3: mongoexport: export a collection, or selected fields

1. Common command format

mongoexport -h IP --port port -u username -p password -d database -c collection -f fields -q query --csv -o filename 

Most of the parameters are self-explanatory; the ones worth highlighting:
-f    export only the given fields, comma-separated: -f name,email,age exports the name, email and age fields
-q    export by query condition: -q '{ "uid" : "100" }' exports the documents whose uid is 100
--csv export in CSV format; this is handy because most relational databases support CSV, so it gives you common ground

2. Export a whole collection

[root@localhost mongodb]# mongoexport -d tank -c users -o /home/zhangy/mongodb/tank/users.dat 
connected to: 127.0.0.1 
exported 4 records 

3. Export selected fields

[root@localhost mongodb]# mongoexport -d tank -c users --csv -f uid,name,sex -o tank/users.csv 
connected to: 127.0.0.1 
exported 4 records 

4. Export data matching a condition

[root@localhost mongodb]# mongoexport -d tank -c users -q '{uid:{$gt:1}}' -o tank/users.json 
connected to: 127.0.0.1 
exported 3 records 

Part 4: mongoimport: import a collection, or selected fields

1. Common command formats

1.1 Import a whole-collection export (non-CSV)
mongoimport -h IP --port port -u username -p password -d database -c collection --upsert --drop filename
The parameter to note is --upsert (the others appeared above): it inserts new documents or updates existing ones.
1.2 Import an export of selected fields
mongoimport -h IP --port port -u username -p password -d database -c collection --upsertFields fields --drop filename
--upsertFields works like --upsert
1.3 Import an exported CSV file
mongoimport -h IP --port port -u username -p password -d database -c collection --type type --headerline --upsert --drop filename
Beyond these three cases, other combinations of the options are possible.

2. Import exported collection data

[root@localhost mongodb]# mongoimport -d tank -c users --upsert tank/users.dat 
connected to: 127.0.0.1 
Tue Dec 3 08:26:52.852 imported 4 objects

3. Import selected fields of a collection

[root@localhost mongodb]# mongoimport -d tank -c users --upsertFields uid,name,sex tank/users.dat
connected to: 127.0.0.1
Tue Dec  3 08:31:15.179 imported 4 objects

4. Import a CSV file

[root@localhost mongodb]# mongoimport -d tank -c users --type csv --headerline --file tank/users.csv 
connected to: 127.0.0.1 
Tue Dec 3 08:37:21.961 imported 4 objects 

Overall, MongoDB's backup and restore tooling feels quite powerful, if a little fiddly.